Relative Measurement Errors 
Among Alternative Pension 
Asset and Liability Measures 


Mary E. Barth 
Harvard Business School 


SYNOPSIS: This study investigates the measures of pension assets and 
liabilities disclosed under SFAS 87 to determine which most closely reflect 
those that investors implicitly assess when they value the firm. Several 
measures considered to have conceptual merit are disclosed under SFAS 
87, but no single method has been deemed most appropriate. The 
attributes of the three asset and five liability measurement alternatives 
disclosed and the controversies surrounding the FASB’s deliberations on 
these alternatives are considered in the research design. Because the 
pension asset and liability issues in SFAS 87 relate to measurement and 
reflect a variety of unresolved questions, an approach directly comparing 
measurement error across alternatives is used. The research design utilizes 
relevance and reliability, two primary accounting choice characteristics 
advanced by the FASB. These two attributes are operationalized by the 
variances and levels of differences between alternative measures and the 
implied investors’ assessment. 
In most prior studies using cross-sectional valuation to address a 
variety of research questions, a model is posited and then tested by using 
book values as substitutes for unobservable variables. The measurement 
error is considered an econometric problem. Although the valuation 
equation assumed here is also based on unobservable market values, the 
measurement error is modeled, and the impact of the measurement error 
covariance structure on the bias in estimated regression coefficients is a 
fundamental research design feature. In prior research, this covariance 
structure is often either assumed to be zero (i.e., the error is “white noise”) 
or left unspecified. The model is based on the attributes of historical costs 
for the differences between market values and book value amounts, 
including the candidate pension alternatives. With appropriate 
assumptions, a technique that exploits the implications of measurement 
errors in cross-sectional regression is used to compare levels of 
measurement error rather than purge or ignore it. 

Although several measures of pension assets and liabilities are found 
to be significant in explaining firm market value, differences among the 
alternatives are also significant. The fair value of plan assets and the ac- 
cumulated benefit obligation each exhibit less measurement error than 
other alternatives for the entire sample. The projected benefit obligation 
has less measurement error variance for subsamples in which the salary 
progression rate includes expected inflation and productivity changes. 
These findings suggest that (1) footnote disclosures are closer to those 
assessed in market valuations than are the measures recognized in the 
balance sheet and (2) investors appear to include expectations about future 
salary progression in assessing pension liabilities, but view the projected 
benefit obligation measure as noisy. 


Key Words: Pensions, Measurement error, Security prices, Financial reporting. 


Data Availability: Data used in this study were obtained from public 
sources. A list of sample firms is available from 
the author upon request. 


THE remainder of this paper is organized as follows. Section I summarizes SFAS 
87 disclosures and the controversy leading to its adoption. This controversy 
provides motivation for the analysis. Section II describes the research design, 
presents a model of measurement error in book values and pension alternatives, and 
derives implications for the coefficient bias. Section III presents the empirical results, 
section IV points out several potential limitations of the study, and section V 
summarizes and concludes. 


I. SFAS 87 and the Pension Accounting Controversy 


The current agenda of the Financial Accounting Standards Board (FASB) is 
concerned with balance-sheet-related issues¹ as well as with issues of reported net income. 
In proposing major changes in the pension area, the board cites effects on balance sheet 
accounts, particularly on liabilities. The most recent pension measurement and 
disclosure standard, Statement of Financial Accounting Standards No. 87, Employers’ 
Accounting for Pensions (SFAS 87, FASB 1985), largely became effective in December 
1987. Its balance sheet provisions are of research interest for several reasons. First, ac- 


1 Examples are pensions, income taxes, the reporting entity, financial instruments and off-balance-sheet 
financing, other postemployment benefits, and asset impairment. 
counting for pensions often produces large and pervasive financial statement effects in 
that different alternatives affect relationships between account balances differently.² 
For example, for the sample firms in this study, alternative pension asset and liability 
measures caused the debt-to-equity ratio to vary between 0.82 and 3.46 on average. 
Second, SFAS 87 changed employers’ accounting for pensions significantly, yet it 
states that “pension accounting ... is still in a transitional stage” (par. 5). This might be 
due to the controversy that surrounded the specifics of the statement. Third, although 
measurement of net periodic pension cost under SFAS 87 affected net income for many 
firms, a stated objective of the FASB was “to improve reporting of financial position” 
(par. 8). 

Although SFAS 87 was mandated for 1987, many firms elected early adoption. Three alternative 
asset and five liability measures are disclosed in SFAS 87, along with other pension 
plan information. Pertinent to this study are the following: 


1. The fair value of plan assets (PLNA). 

2. The projected benefit obligation (PBO), accumulated benefit obligation (ABO), 
and vested benefit obligation (VBO). ABO differs from PBO by not including ef- 
fects of future compensation levels. VBO is the actuarial present value of vested 
benefits without effects of future compensation levels. 

3. The amount of pension asset (BVPA) and/or liability (BVPL) recognized in the 
balance sheet before the minimum liability provisions become effective.³ These 
are calculated as in Accounting Principles Board Opinion No. 8 (APB 8, 1966) in 
that they reflect the difference between net periodic pension cost and the 
amount contributed to the plan,⁴ but the pension cost as defined in SFAS 87 is 
used in making the calculations. 

4. The “additional minimum liability” recognized in the balance sheet. This 
amount is necessary to adjust BVPA and/or BVPL to the excess, if any, of the 
accumulated benefit obligation over the fair value of plan assets (the “minimum 
liability”). FASL, the liability to be recognized under SFAS 87, is the greater of 
BVPL or this “minimum liability.”⁵ FASA, the asset to be recognized under 
SFAS 87, is equal to BVPA unless an additional minimum liability is recognized. 
In that case, an amount equal to the additional minimum liability is recognized 
as the asset if it does not exceed unrecognized prior service cost. 

5. The weighted average assumed discount rate and rate of compensation increase, 
if applicable.⁶ 


The three pension asset alternatives compared are PLNA, BVPA, and FASA. The five 
pension liability alternatives are ABO, VBO, PBO, BVPL, and FASL. 
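
The recognition rule in item 4 can be sketched as a small function. This is a minimal illustration of the description above, not the FASB's own computation. It assumes all inputs are non-negative magnitudes (unlike the signed convention used in the regressions), the function and argument names are hypothetical, and capping the asset at unrecognized prior service cost when the additional minimum liability exceeds it is an assumption that the text above does not spell out.

```python
def sfas87_recognized_amounts(bvpa, bvpl, abo, plna, unrecognized_psc):
    """Return (FASL, FASA, additional minimum liability).

    Hypothetical helper; every argument is a non-negative magnitude.
    """
    # "Minimum liability": excess, if any, of the ABO over the fair value of plan assets.
    minimum_liability = max(abo - plna, 0.0)
    # FASL is the greater of BVPL and the minimum liability.
    fasl = max(bvpl, minimum_liability)
    # Additional amount needed to bring the booked liability up to FASL.
    additional = fasl - bvpl
    if additional > 0:
        # Assumption: the recognized asset is capped at unrecognized prior service cost.
        fasa = min(additional, unrecognized_psc)
    else:
        fasa = bvpa
    return fasl, fasa, additional


# Underfunded plan: ABO 100, plan assets 70, accrued liability 10.
print(sfas87_recognized_amounts(0.0, 10.0, 100.0, 70.0, 50.0))
```

For an overfunded plan the minimum liability is zero, so the function simply returns the booked amounts, which mirrors the asymmetry the paper notes between underfunded and overfunded plans.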


2 This study addresses measurement error in alternative pension asset and liability measures as these affect 
valuation of firms’ common equity. It does not address their effects on firms’ financial ratios or provide 
economic interpretation of such ratios. 

3 These provisions are mandatory after December 1989; 23, 16, and 9 firms in the 1987, 1986, and 1985 
samples implemented them early. 

4 To the extent these differ across years, the difference is cumulative. Prior to SFAS 87, these differences 
were small. As the sample years fall shortly after adoption of SFAS 87, current-year and cumulative amounts 
are unlikely to be very different. 

5 For 667, 415, and 90 firms in the 1987, 1986, and 1985 samples, FASL is different from zero, and for 254, 
145, and 25 firms FASL is different from BVPL. 

6 In the 1987 sample, 1,015 of the 1,082 firms had at least one plan with a nonzero assumed rate of 
compensation increase. 
Issuance of SFAS 87 followed a long debate over the most appropriate measure of 
pension assets and liabilities to include in the financial statements of employers 
sponsoring pension plans. First, the debate centered on the economic relation between the 
pension plan and the firm: whether the pension plan is a separate entity, the “legal 
form” view, or effectively a part of the firm, the “economic substance” view (Miller 
1987).⁷ Although not achieving it completely in SFAS 87, the FASB adopted an 
economic substance view,⁸ and then the debate focused on the subtle differences among 
candidate alternatives.⁹ Under the economic substance view, the firm’s obligation is to 
employees rendering services rather than to the trust; therefore, pension plan assets 
and liabilities are considered firm assets and liabilities. PLNA, ABO, VBO, and PBO 
measure pension plan assets and liabilities. BVPA and BVPL reflect the legal form view 
in that they are prepaid or unfunded accrued pension plan contributions. This study’s 
empirical results from comparing the amounts recognized (BVPA and BVPL) with the 
disclosed amounts (PLNA and ABO, VBO, or PBO) provide evidence as to which view 
investors appear to hold. 

There is also controversy over which obligation measure (accumulated, vested, or 
projected benefits) is most appropriate under the economic substance view. Some 
advocate recognition of only vested benefits. They argue that employers can avoid obli- 
gations in excess of the vested amount (SFAS 87, par. 148), although the going-concern 
assumption reduces the strength of the argument. Vested benefits are disclosed under 
SFAS 87 but otherwise play no role in calculating financial statement amounts. In con- 
trast, accumulated benefits determine recognition of the minimum liability, and pro- 
jected benefits are used in determining net periodic pension cost. Both accumulated 
and projected benefits are used because of disagreement as to which is more appropri- 
ate. 

Three major areas of controversy between accumulated or projected benefits exist. 
First, in Preliminary Views of the Financial Accounting Standards Board on Major Issues 
Related to Employers’ Accounting for Pensions and Other Postemployment Benefits the 
FASB (1982, par. 61) states that in measuring the future sacrifice based on the present 
obligation, one must consider whether the pension contract defines the obligation as a 
function of future compensation. However, many respondents believed the obligation 
resulting from future compensation increases could not be a liability since it is not the 
“result of past transactions or events.” If it is the result of past transactions or events, 
then projected benefits are more appropriate; otherwise, accumulated benefits are 
more appropriate. Second, the projected benefit obligation calculation requires an 
assumed rate of expected salary increases, including expected inflation and productiv- 
ity components (real wage increases). The salary progression assumption requires 
significant judgment and introduces considerable uncertainty beyond that in the accu- 
mulated benefit obligation calculation. This uncertainty reduces the reliability of the 


7 Employers’ accounting for pensions under APB 8 was based on the legal form view. The “legal form” 
versus “economic substance” debate here refers to the relation between the firm and the pension trust and is 
different from the issue of legal versus economic liabilities of the pension plan itself. See, e.g., Francis and 
Reiter (1987) for consideration of this alternate issue. 

8 FASA and FASL are hybrid measures resulting from a move from the legal form view toward the 
economic substance view but also reflect many compromises. 

9 In addition to the major conceptual differences discussed here, other measurement-specific issues (e.g., 
determination of interest rates, specification of actuarial cost methods, attribution rules, and recognition of 
gains and losses) were also debated. They are not considered further here, as they are applied to all measures; 
this study focuses on fundamental differences among the measures. 
measure, potentially affecting the relevance and reliability criteria for recognition 
(Statement of Financial Accounting Concepts No. 2, FASB 1980). Third, some believe the 
accumulated benefit obligation is systematically understated to the extent that future 
salaries reflecting future inflation determine the firm’s pension benefits (SFAS 87, par. 
140). Future inflation is ignored in calculating accumulated benefits, yet all obligation 
measures use the same discount rate, which implicitly includes an inflation com- 
ponent. 
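
The second and third points can be made concrete with a toy calculation. The sketch below is not an actuarial valuation: it assumes a single lump-sum benefit equal to a fixed fraction of salary, paid at retirement, with no mortality, turnover, or vesting, and all names and numbers are illustrative. It contrasts an ABO-like amount (current salary) with a PBO-like amount (salary projected at an assumed growth rate), both discounted at the same nominal rate.

```python
def toy_obligations(current_salary, benefit_fraction, years_to_retirement,
                    discount_rate, salary_growth):
    """Return (ABO-like, PBO-like) present values for one stylized employee."""
    discount_factor = (1.0 + discount_rate) ** years_to_retirement
    # ABO-like: benefit based on today's salary, ignoring future pay increases.
    abo_like = benefit_fraction * current_salary / discount_factor
    # PBO-like: benefit based on salary projected to retirement.
    projected_salary = current_salary * (1.0 + salary_growth) ** years_to_retirement
    pbo_like = benefit_fraction * projected_salary / discount_factor
    return abo_like, pbo_like


# Illustrative inputs: 8% discount rate, 5% assumed salary progression.
abo_like, pbo_like = toy_obligations(50_000.0, 0.30, 20, 0.08, 0.05)
print(round(abo_like, 2), round(pbo_like, 2))
```

Because the discount rate includes an inflation component while the ABO-like numerator does not, the ABO-like amount falls as expected inflation rises, whereas the PBO-like amount depends heavily on the judgmental salary growth input.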

The asset and liability measures recognized under SFAS 87 (FASA and FASL) 
reflect FASB compromises. For example, a liability at least equal to the excess of ac- 
cumulated benefits over plan assets is recognized for underfunded plans but no com- 
parable asset is recognized for overfunded plans. Also, no single liability is identified. If 
a plan’s underfunded position (based on the accumulated benefit obligation and the fair 
value of plan assets) is less than unfunded accrued pension cost, the liability is the 
accrued amount. Otherwise, an additional amount is recognized to increase the liability 
to the underfunded amount. This is viewed in light of the assertion that footnote disclo- 
sure is not a substitute for recognition and “the usefulness and integrity of financial 
statements are impaired by each omission of an element that qualifies for recognition” 
(SFAS 87, par. 116). Evidence on whether these compromises significantly reduce the 
measures’ appropriateness is provided by comparing FASA and FASL with other mea- 
sures. 


II. Research Method 


Several studies examine the income statement components of earnings. For exam- 
ple, Lipe (1986), Barth et al. (1990), and Barth et al. (1991) examine whether different 
coefficients are assigned to different earnings components in various settings. Bowen 
(1981), Bowen and Daley (1982), and Wilson (1986, 1987) are other examples. Some re- 
searchers have examined balance sheet line items, and pension assets and liabilities in 
particular. For example, using a risk model, Dhaliwal (1986) examines whether capital 
market participants view unfunded vested pension obligations as a form of debt. In a 
closely related study, Landsman (1986) examines whether investors view pension assets 
and liabilities disclosed under SFAS 36 as firm assets and liabilities. SFAS 36 is less 
extensive than SFAS 87, which has also increased comparability across firms by 
eliminating some previous areas of discretion (e.g., by requiring a single actuarial cost 
method). 

The research question in this study is which measures of pension assets and lia- 
bilities most closely reflect those investors implicitly use in valuing a firm. It is 
addressed using cross-sectional regressions based on the observation that a firm’s 
market value of equity is the sum of its asset and liability market values when liabilities 
are negative amounts.¹⁰ These include all assets and liabilities priced by the market, 
whether or not they appear in the firm’s accounting balance sheet. Assets not included 
in the balance sheet may, for example, include such intangible assets as brand names, 


10 This relation is used in Landsman (1986) and assumes spanning and no arbitrage. The model here adopts 
the Miller view that no tax advantage of debt need be incorporated. This is consistent with FASB measures of 
assets and liabilities, which are recognized on a before-tax basis. As the model is asset- and liability-based, there 
is no explicit role for other factors, such as future earnings or quality of management. Implicitly, the asset and 
liability market values, as present values of future cash flows (see Barth 1989), include these factors. 
products from research and development efforts, and franchises. Market values are 
defined as the amounts implicitly assessed by investors when they value the firm. 
Following Landsman, the valuation equation used is: 


MVE=MVA+MVL+MVPA+MVPL, (1) 


where MVE, MVA, and MVL are the market values of equity and of total assets and liabilities 
other than pensions, and MVPA and MVPL are the market values of pension assets and 
liabilities.¹¹ (Liabilities are considered negative values throughout the analysis.) 

The variables on the right-hand side of the equation are unobservable. When book 
values are substituted in equation (1), the differences become measurement error. The 
direction and magnitude of the resulting coefficient bias are generally ambiguous, de- 
pending on the covariance structure of the measurement error. In prior research, this 
covariance structure is often either assumed to be zero (i.e., the error is “white noise”) 
or left unspecified.¹² Below, this covariance structure is derived from a model of 
differences between book values and underlying market values, which can be used to unravel 
components of the resulting coefficient bias to examine pension alternatives’ 
characteristics. Equation (1) indicates that without such differences, all coefficients would 
equal one.¹³ The alternative with the smallest measurement error variance is 
considered to possess more relevance and reliability than the others.¹⁴ This leads to the 
following: 

$$MVE = \gamma_0 + \gamma_1\,BVA + \gamma_2\,BVL + \gamma_3\,PAi + \gamma_4\,PLi + \varepsilon, \qquad (2)$$


where BVA and BVL are the book values of total nonpension assets and liabilities, and 
PAi and PLi are the ith alternative measure of the pension asset or liability. The 
estimated parameters $\gamma_1$ through $\gamma_4$ may not equal one because of measurement error in 
BVA and BVL, as well as in PAi and PLi, depending on the covariances among the 
variables and the measurement errors. The intercept, $\gamma_0$, takes account of measurement 
errors with nonzero means, as well as some concern about cross-sectional correlation 
of residuals.¹⁵ Such correlation may result from macroeconomic factors affecting all 
sample firms in the estimation period. To the extent this is a mean effect, the intercept 
absorbs it as well.¹⁶ 
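
As a benchmark for the regression of MVE on BVA, BVL, PAi, and PLi with an intercept, the sketch below fits ordinary least squares to made-up data in which the regressors equal the underlying market values exactly. It is a minimal pure-Python illustration, not the paper's estimation procedure (which uses a system estimated by Seemingly Unrelated Regression). The six "firms" and the helper names are hypothetical. With no measurement error, the fitted slopes should all be one and the intercept zero, up to rounding.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """OLS with an intercept, via the normal equations X'X b = X'y."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical firms: (BVA, BVL, PAi, PLi), liabilities entered as negatives.
firms = [
    (100.0, -60.0, 20.0, -15.0),
    (250.0, -120.0, 35.0, -40.0),
    (80.0, -30.0, 10.0, -5.0),
    (400.0, -310.0, 90.0, -70.0),
    (150.0, -90.0, 5.0, -25.0),
    (320.0, -150.0, 60.0, -80.0),
]
mve = [sum(f) for f in firms]  # equity value as the sum of the four components
gammas = ols(firms, mve)       # [intercept, BVA, BVL, PAi, PLi]
print([round(g, 6) for g in gammas])
```

Adding noise to any one column of `firms` while leaving `mve` unchanged pulls the fitted coefficients away from one, which is the bias the paper decomposes.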


11 Pension assets and liabilities are implicitly assumed to be assets and liabilities of the firm. Dhaliwal 
(1986), Landsman (1986), Daley (1984), and the results reported in section III find evidence that they are. The 
FASB makes the same assumption when it deliberates over pension alternatives. Also, the off-balance-sheet 
pension asset and liability amounts are positively correlated with the market-to-book difference. The fair value 
of plan assets net of the accumulated benefit obligation has correlations with the excess of market value over 
book value of equity of 0.51, 0.75, and 0.61 in 1987, 1986, and 1985. 

12 See Barth (1989) for a review of this literature. 

13 Econometric measurement error may also arise from misspecifying the underlying relation. Equation (1) 
assumes a linear relation. If this is not reasonable, the results may be affected. Measurement error may also 
result from lack of a book value counterpart for every asset and liability priced by investors (omitted variables). 
Under the model’s assumptions, it can be shown that the possible presence of such assets and liabilities does 
not affect comparison of the alternatives. 

14 The difference terms are random variables. Therefore, their “size” is measured by their variances. Using 
variances to assess the degree of measurement error is consistent with standard econometrics texts (e.g., Judge 
et al. 1985, 706-09) and with the measurement error literature (e.g., Miller and Modigliani 1966). Evidence is 
also presented on magnitude bias in the pension liability measures. See section III. 

15 Footnote 24 describes assumptions concerning cross-sectional correlation in connection with reported 
test statistics. 

16 All regressions were also estimated without an intercept, with results qualitatively the same as those 
reported. As the analysis is comparative and separate-year regressions are estimated, general market 
movements should not affect comparisons of the alternatives. 
Model of Difference Terms and Implications for Coefficient Bias 


Historical cost accounting introduces major differences between recorded book 
values and related market values. The distortion is a function of account turnover 
rate—it is greater for long-term than for current accounts. This suggests that market 
values of current accounts may not be systematically different from their recorded 
amounts and that the book value of current assets and liabilities can be reasonably 
modeled as market value plus an independent error term.¹⁷ Long-term assets and 
liabilities are often initially recorded years before the current balance sheet date and are 
amortized at accounting depreciation rates that reflect economic depreciation only by 
chance (Beaver 1989). As a result, the difference terms between book and market values 
are more complex for long-term accounts as compared to current accounts. The 
sources and nature of the difference terms are derived and more fully described in 
Barth (1989).¹⁸ 

The model of book values of nonpension assets and liabilities in terms of their 
underlying market values is as follows: 


$$BVA = MVA + u_{LTA} + u_{CA}; \qquad BVL = MVL + u_{LTL} + u_{CL}, \qquad (3)$$


where CA, CL, LTA, and LTL denote current- and long-term assets and liabilities. $u_{CA}$, 
$u_{CL}$, $u_{LTA}$, and $u_{LTL}$ are assumed to have cross-sectional means that may not equal zero. 
Because of their nature, $u_{LTA}$ and $u_{LTL}$ are assumed to be correlated with each other and 
with all underlying asset and liability market values. For the sake of parsimony, $u_{CA}$ and 
$u_{CL}$ are modeled as if they were independent.¹⁹ All market values are presumed to be 
correlated with each other. Each difference term, $u_x$, has variance $\sigma^2_{u_x}$. Covariance 
between x and y is denoted $\sigma_{x,y}$. The pension asset and liability alternatives are modeled 
as: 


$$PLi = s_i\,(MVPL + u_{PLi}), \qquad PAi = MVPA + u_{PAi}, \qquad (4)$$


where $u_{PLi}$ and $u_{PAi}$ are uncorrelated with other variables or with their measurement 
error.²⁰ All measures are current as of the financial statement date because SFAS 87 
requires reporting current estimates of market values and using current pension plan 
assumptions in calculating other reported amounts.²¹ 

Including $s_i$ in equation (4) recognizes that different pension liability measures 
have, essentially by construction, different magnitudes. Generally, BVPL < FASL ≤ 


17 Last-in, first-out (LIFO) inventories are an obvious potential exception. In the empirical work, the 
inventory-carrying amount of firms using LIFO is adjusted by the LIFO reserve. 

18 The valuation approach there is derived from the consumption-based models of Rubinstein (1976) and 
Lucas (1978). 

19 If the assumption that $u_{CA}$ and $u_{CL}$ are independent variables is relaxed, $Y_1,\ldots,Y_4$ defined in the next 
subsection will include the covariance terms. As long as $u_{CA}$ and $u_{CL}$ are uncorrelated with $u_{PAi}$ and $u_{PLi}$, the 
covariances will not change with a change in pension alternative, and alternative comparisons are unaffected. 

20 Previous literature on pension funding provides evidence that pension assets and liabilities are 
endogenously determined (see, e.g., Francis and Reiter 1987). Funding assumptions partially determine the level of 
pension assets and (indirectly) liabilities, but should have little impact on error in their measurement. Evidence 
also exists that discount rates, which affect the measurement of the pension liability, disclosed under SFAS 36 
are related to firms’ profitability, leverage, and tax status. This does not appear to hold for the sample firms and 
SFAS 87 rates. Discount rates are generally uncorrelated with leverage (long-term debt/book value of common 
equity and debt/total assets), net income, and tax rates. 

21 Although the market value of plan assets may be measured essentially without error, a difference may 
arise, for example, if investors do not view all plan assets as firm assets. 
VBO < ABO ≤ PBO (in absolute terms). As a result, they cannot all measure the 
underlying liability without systematic bias, and a model without a scale factor would be 
inappropriate. Alternative i is considered to have less measurement error than alternative 
j if $\sigma^2_{u_i} < \sigma^2_{u_j}$. Thus, an alternative with nonzero measurement error variance (but, 
perhaps, no magnitude bias) is considered to have more measurement error than an 
alternative with no measurement error variance but, perhaps, a large magnitude bias. The $s_i$ 
values are estimated separately as evidence of systematic magnitude bias in the 
measures. Although cross-sectional constancy of the $s_i$ is a simplifying assumption 
necessary for estimation, it is not unreasonable. Many firm-specific (and industry-related) 
differences are captured in actuarial and other assumptions used to calculate all 
alternatives. Firm-specific differences in MVPL already reflected in the alternatives do not 
impact the $s_i$ (or $u_{PLi}$). In addition, there should be no systematic differences among 
groups of firms because of the neutrality requirement in standards setting.²² (Appendix 
B offers empirical evidence on the reasonableness of the assumption about the $s_i$. It also 
presents pension liability comparison results without scaling by $s_i$, which are somewhat 
different from those reported in section III.) 
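
The paper's footnote on scaling explains that each alternative's sample mean estimates the scale factor times the mean underlying liability, so dividing the mean of VBO by the mean of another alternative gives a consistent estimate of the relative scale. A minimal sketch of that rescaling follows. The three-firm sample is made up for illustration and the function name is hypothetical.

```python
from statistics import mean


def scale_to_base(alternatives, base="VBO"):
    """Rescale each liability alternative so its sample mean matches the base.

    `alternatives` maps a measure name to a list of firm-level amounts
    (liabilities entered as negative values).
    """
    base_mean = mean(alternatives[base])
    # Ratio of sample means estimates the relative scale factor for each measure.
    ratios = {name: base_mean / mean(values)
              for name, values in alternatives.items()}
    scaled = {name: [ratios[name] * v for v in values]
              for name, values in alternatives.items()}
    return ratios, scaled


# Illustrative (made-up) amounts for three firms.
sample = {
    "VBO": [-40.0, -90.0, -20.0],
    "ABO": [-50.0, -100.0, -30.0],
    "PBO": [-70.0, -130.0, -40.0],
}
ratios, scaled = scale_to_base(sample)
print({name: round(r, 4) for name, r in ratios.items()})
```

After rescaling, every alternative has the same sample mean, so remaining differences across alternatives reflect dispersion rather than level, which is what the variance comparison is meant to isolate.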


Detecting Changes in Measurement Errors 


The underlying relation is stated by equation (1), but equation (2) is the one being 
estimated. Estimates of $\gamma_w$ in equation (2) are biased estimates of the theoretical 
coefficient of 1 due to measurement error. Expressions for the bias, the difference between 1 
and $\gamma_w$, are derived in appendix A by using the difference terms model developed 
above. These expressions include the pension alternatives’ measurement error 
variances, $\sigma^2_{u_{PAi}}$ and $\sigma^2_{u_{PLi}}$, as parameters. Evidence on these parameters is obtained by 
estimating terms containing these variances, comparing them across alternatives, and 
assessing the statistical significance of the differences. Although differences due to 
magnitude bias in the alternative measures also contribute to measurement error and 
evidence on such bias is presented, the analysis focuses first on precision. 

Equation (2) is estimated by setting $\gamma_w = 1 - B_w$, where $B_w$ is the coefficient bias 
derived in appendix A.²³ Four $B_w$ terms containing measurement error variances and 
covariances, $Y_1,\ldots,Y_4$, are defined in appendix A, equation (A5). These bias components 
are of primary interest. Including different pension asset or liability measures (PAi or 
PLi) in the estimating equation results in different coefficient bias (and therefore 
different $Y_1,\ldots,Y_4$) because they have different measurement errors. $Y_1$, $Y_2$, and parts of 
$Y_3$ and $Y_4$ are constant across pension alternatives, yet other parts of $Y_3$ or $Y_4$ change 
when one pension asset or liability alternative is replaced with another, as they contain 
$\sigma^2_{u_{PAi}}$ and $\sigma^2_{u_{PLi}}$. The higher the measurement error variance ($\sigma^2_{u_{PAi}}$ or $\sigma^2_{u_{PLi}}$), the smaller is 


22 Because of $s_i$ in equation (4), the MVPL coefficient in equation (1) actually equals $1/s_i$ rather than 1. To 
account for this in estimation, alternatives are scaled by estimates of $s_{VBO}/s_i$. As the alternatives’ sample means 
are estimates of $s_i E\{MVPL\}$, dividing the mean of VBO by the means of ABO, PBO, BVPL, and FASL provides 
consistent estimates of $s_{VBO}/s_i$. The discussion proceeds as if $s_{VBO} = 1$. This is not an assumption of the model. It 
can be shown that the ranking of alternatives is not affected by the value of $s_{VBO}$, as the technique is 
comparative. The model was also estimated by using PBO as the base, with no qualitative difference in results. 

23 Cross-sectional estimation requires assuming that variances and covariances in $B_w$ are cross-sectional 
constants. Although FASB decisions apply equally to all firms and this assumption is consistent with that 
approach, it is not expected to hold precisely. However, there was no qualitative difference in results when the 
analyses were repeated with various subsets of sample firms and deflators. See appendix B. 
$Y_3$ or $Y_4$. Ranking $Y_3$ or $Y_4$ from highest to lowest is equivalent to ranking the related 
alternatives from lowest measurement error to highest. 
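
The intuition behind this ranking can be illustrated with a single-regressor analogue. The paper's bias components come from a multivariate system derived in appendix A, which is not reproduced here. In the textbook one-variable case, the OLS slope converges to the variance of the true value divided by the sum of that variance and the measurement error variance, so a noisier proxy yields a smaller coefficient. The simulation below is a sketch under that simplification, with made-up parameters and hypothetical function names.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_slopes(n=5000, seed=0):
    """Compare slopes for a low-noise and a high-noise proxy of the same value."""
    rng = random.Random(seed)
    true_value = [rng.gauss(100.0, 20.0) for _ in range(n)]
    market_value = true_value[:]  # true coefficient is 1 by construction
    precise = [v + rng.gauss(0.0, 5.0) for v in true_value]   # error s.d. 5
    noisy = [v + rng.gauss(0.0, 20.0) for v in true_value]    # error s.d. 20
    return ols_slope(precise, market_value), ols_slope(noisy, market_value)


slope_precise, slope_noisy = simulate_slopes()
print(round(slope_precise, 3), round(slope_noisy, 3))
```

Under these assumed variances the large-sample slopes are 400/425 for the precise proxy and 400/800 for the noisy one, so ranking the proxies by estimated coefficient from highest to lowest ranks them from lowest to highest measurement error variance.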

Estimation details (including the equations estimated) are in appendix A. Estimation 
proceeds as follows. First, several “nuisance” parameters are estimated in auxiliary 
regressions. Next, an unrestricted system of equations is estimated comprising equation (2), 
setting $\gamma_w = 1 - B_w$, for each pension asset (or liability) alternative for a given liability (or 
asset) alternative, where $Y_3$ (or $Y_4$) differ across alternatives; this provides a point 
estimate ranking of the $Y_3$ (or $Y_4$). Since disturbance terms are likely correlated across 
equations, Seemingly Unrelated Regression is used to estimate the system of equations. 
To test statistical significance of the point estimate rankings of $Y_3$ (or $Y_4$), the system is 
then estimated with $Y_3$ (or $Y_4$) restricted to be equal for the alternatives being compared. 
This restriction is equivalent to restricting the measurement error variances for the 
pension assets and/or liabilities to be the same. The Gallant and Jorgenson (1979) $\chi^2$ 
statistic tests significance of these restrictions (degrees of freedom equal the number of 
restrictions).²⁴ For the sake of parsimony, significance tests for only the following 
liability comparisons are reported: ABO with VBO, ABO with PBO, VBO with PBO, these 
three together, BVPL and FASL with their closest competitors, and all five together. 

Heteroscedasticity causes inefficient coefficient estimates and potentially incorrect 
reported standard errors. The White (1980) $\chi^2$ test rejects homoscedasticity at 
conventional levels for most regressions.²⁵ When rejected at the ten percent level, 
heteroscedasticity-consistent standard errors (White 1980) are used to calculate reported 
t-statistics. Coefficients with t-statistics greater in absolute value than 1.86 are 
considered significant.²⁶ 
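
For readers unfamiliar with heteroscedasticity-consistent standard errors, the sketch below computes the basic White (1980) estimator, often labeled HC0, for the slope of a simple one-regressor regression. It is a simplified illustration: the paper's regressions have several regressors and are estimated as a system, and the function name and data are hypothetical.

```python
import math


def slope_with_white_se(x, y):
    """Return (slope, HC0 standard error, t-statistic) for y = a + b*x + e."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    residuals = [b - intercept - slope * a for a, b in zip(x, y)]
    # HC0: weight each squared residual by its squared regressor deviation.
    hc0_var = sum(((a - mx) ** 2) * (e ** 2)
                  for a, e in zip(x, residuals)) / sxx ** 2
    se = math.sqrt(hc0_var)
    return slope, se, slope / se


slope, se, t_stat = slope_with_white_se([0.0, 1.0, 2.0], [0.0, 1.0, 3.0])
print(round(slope, 4), round(se, 4), round(t_stat, 2))
```

Unlike the conventional formula, this estimator does not assume a common residual variance across firms, which matters in a cross-section where large and small firms are pooled.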


III. Results 
Data and Sample Firms 


Sample firms include all Compustat firms that (1) were publicly traded, (2) have 
available common stock prices, and (3) disclose information under SFAS 87. Most vari- 
ables are Compustat data items, and others were derived from available data. Discount 
rates and rates of compensation increase were not available on Compustat and were ob- 
tained from annual reports and Forms 10-K. BVA and BVL are total assets and liabilities 
less prepaid or unfunded accrued pension cost (BVPA and BVPL). ABO, VBO, PBO, 
and PLNA were separately disclosed, and FASA and FASL were calculated from foot- 
note data. MVE is the fiscal year-end closing stock price times the number of outstand- 
ing shares. Although published in December 1985, SFAS 87 permitted early adoption, 


24 This test statistic assumes asymptotic normality of coefficient estimates. Independence and normality of 
sample disturbance terms are not assumed. Hansen (1982) shows the weaker conditions of stationarity and 
ergodicity of the observable variables are sufficient for asymptotic normality of estimators, which can be 
viewed as generalized method of moments estimators, as can the estimators used here. The Gallant and 
Jorgenson (1979) setting is consistent with Hansen (1982), which implies that stationarity and ergodicity are also 
sufficient here. The stationarity assumption may be relaxed, given further restrictions on variables’ higher order 
moments. Ergodicity, however, is a time-series concept not readily interpretable in a cross-sectional setting. 
Therefore, mean independence of error terms, a condition weaker than independence yet interpretable 
in a cross-sectional setting, is assumed instead. The error terms are also assumed to have finite variances. 

25 The statistic also tests model specification. See appendix B for additional specification checks. 

26 The reported t-statistics should be viewed as descriptive statistics, i.e., no claim of normality in finite 
samples is made or implied. 
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Table 1 
Descriptive Firm Statistics 
Firm Standard 
Characteristics Mean Deviation Median 
MVE" 
1987 1314.57 3932.17 183.06 
1986 1577.32 4562.81 315.02 
1985 826.10 1899.50 255.26 
BVA” 
1987 2773.03 8878.82 408.87 
1886 2963.51 9062.58 543.14 
1985 1298.82 3038.51 388.40 
BVL" l 
1987 — 1922.60 — 6986.50 — 217.69 
1988 — 1994.60 — 7250.88 —~ 303.33 
1985 — 767.61 — 2026.68 — 185.50 
Sales* 
1987 2154.22 6472.29 413.01 
1986 2420.50 6880.89. 612.39 
1985 1484.75 3193.00 484.21 
Employees 
1987 17.26 49.21 3.78 
1986 20,92 57.01 5.29 
1985 12.20 22.96 4.00 
Market-to-book ratio 
1987 1.70 3.75 1.35 
1988 1.82 1.47 1.57 
1985 1.89 1.48 1.86 
Common stock beta 
1987 1.06 0.37 1.10 
1986 1.07 0.33 1.10 
1985 1.07 0.38 1.16 





MVE= Market value of common equity. 

BVA = Book value of nonpension assets. — 

BVL= Book value of nonpension liabilities. 

e Dollars in millions. 

* Thousands. 

Nobs for MVE, BVA, and BVL equa! nobs used in regressions, nobs for other variables are somewhat 
fewer. l 


but was required beginning with December 1987 fiscal years. Firm observations for this
first mandatory year comprise the 1987 sample.27 Firm observations for fiscal years
ending from December 1986 through November 1987 and December 1985 through
November 1986 comprise the 1986 and 1985 samples. This results in 1,082 firms for
1987, the mandatory disclosure year, and 702 and 150 firms for 1986 and 1985, the early
adoption years.

Descriptive size and market statistics for sample firms are presented in table 1. The 


27 The 8 July 1988 Compustat database, including currently and previously traded firms, was used. The
latest fiscal year observations generally were from March 1988.

1986 sample firms are, on average, the largest and 1985, the smallest. The 1985 firms
have the largest average market-to-book ratio (based on recognized amounts that in- 
clude BVPA and BVPL for pensions), and 1987 have the smallest. Firms in all three 
years have similar common stock betas. Approximately 60 percent of the firms trade on 
the New York Stock Exchange, 10 percent on the American Stock Exchange, and 30 
percent over-the-counter. Sample firms are not industry-concentrated. Sixty, 56, and 43 
different two-digit primary SIC codes are represented in 1987, 1986, and 1985, with the 
largest percentage representation of any one code of 7.4 percent, 9.0 percent, and 11.3 
percent, respectively. Table 2 contains descriptive statistics for pension variables both 
undeflated and deflated by market value of equity. Pension asset and liability measures 
are, on average, a smaller percentage of market value of equity for 1985 firms than for 
the other years. In all three years, however, each of the projected, accumulated, and 
vested benefit obligations and the fair value of plan assets constitute a noticeable per- 
centage of market value of equity. On average they range from 17 percent (1985 VBO) to 
34 percent (1986 PLNA). There is little variation in discount rates or in the rates of com- 
pensation increase across years. The average discount rate ranges from 8.4 to 8.8 per- 
cent, and the rate of compensation increase ranges from 5.6 to 6.1 percent. 


Empirical Results—No Consideration of Measurement Error 


As a benchmark, equation (2) is estimated with OLS for each alternative and without
specifying γ = 1 − B. The results are presented in tables 3 through 5. All three pension
asset alternatives are matched with all five pension liability alternatives for complete- 
ness although some asset and liability combinations may seem conceptually implausi- 
ble, such as ABO, VBO, or PBO matched with FASA or BVPA or PLNA matched with 
BVPL or FASL. The tables segregate these possibly implausible combinations as noted. 

When PLNA measures pension assets and pension liability alternatives are com- 
pared, many of the coefficient estimates are significantly different from both zero and 
one.28 In four instances, the coefficients on disclosed amounts are significantly different
from zero but not significantly different from one. This is not the case for the
amounts recognized, or to be recognized, in the balance sheet (BVPL or FASL), even 
when BVPA or FASA are the pension asset measures. However, signs and magnitudes 
of many coefficients vary with the liability alternatives. When BVPA measures pension 
assets, coefficients on disclosed pension liability measures generally have the wrong 
sign in 1987 and 1986 (a third are significantly different from zero and all are signifi- 
cantly different from one), as do those on BVPA. BVPA coefficient estimates range 
from — 29.76 to 2.44, and all but one are significantly different from zero (and one). In 
1985 the coefficients on BVPA are not significantly different from zero or one, and the 
pension liability coefficients are significant and have the correct sign. When FASA 
measures pension assets, several coefficients on the pension asset and liability alterna- 
tives are again negative, and many are significantly different from zero (and one). With 
all three asset measures, most BVA and BVL coefficients are also significantly different 
from one (and zero). 

As a further benchmark, when MVE was regressed on only BVA and BVL, the
adjusted R² was 0.81, 0.82, and 0.85 for 1987, 1986, and 1985, respectively. In 43 of the 45


28 The square bracketed t-statistics in tables 3 through 5 indicate that the related coefficient estimate is 
significantly different from one at less than the five percent level. 


Table 2
Descriptive Statistics for Pension Variables

                      Dollars in Millions                   Percent of MVE
Pension                        Std.                                Std.
Variables          Mean        Dev.     Median         Mean        Dev.     Median

PBO
  1987          -377.63     1770.20     -24.49       -33.12       81.95     -14.22
  1986          -403.87     1902.22     -45.91       -31.74       75.81     -15.41
  1985          -144.98      531.41     -27.80       -22.17       28.49     -12.89
ABO
  1987          -319.56     1535.96     -19.27       -28.86       78.04     -11.16
  1986          -335.94     1638.15     -35.65       -27.22       69.51     -11.78
  1985          -120.81      471.09     -21.00       -18.41       25.22     -10.30
VBO
  1987          -293.63     1398.49     -17.14       -28.90       74.88     -10.10
  1986          -308.66     1469.83     -31.40       -25.32       68.59     -10.41
  1985          -114.46      462.11     -18.12       -17.07       24.07      -9.14
BVPL
  1987           -12.68       50.98      -0.42        -3.05       15.83      -0.34
  1986           -14.68       69.51      -0.49        -2.11        8.20      -0.23
  1985            -6.00       23.31      -0.37        -1.25        3.07      -0.26
FASL
  1987           -16.45      112.13      -0.30        -4.03       24.18      -0.25
  1986           -14.72       71.10      -0.30        -2.58       11.87      -0.17
  1985            -6.42       25.15      -0.22        -1.30        3.28      -0.19
PLNA
  1987           438.35     2157.59      26.77        33.19       71.85      16.06
  1986           464.67     2176.04      54.46        34.41       77.98      17.93
  1985           192.09      819.01      39.30        26.80       32.61      16.13
BVPA
  1987            11.12       99.35       0.00         0.95        3.96       0.00
  1986             8.97       82.85       0.00         0.80        3.22       0.00
  1985             2.50        9.08       0.00         0.61        1.94       0.00
FASA
  1987            14.11      128.37       0.00         1.47        7.78       0.00
  1986             8.63       82.51       0.00         1.03        6.04       0.00
  1985             2.76        9.94       0.00         0.55        1.77       0.00

MVE = Market value of common equity.
PBO = Projected benefit obligation.
ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
BVPL = Pension liability currently recognized.
FASL = Pension liability to be recognized under SFAS 87.
PLNA = Fair value of pension assets.
BVPA = Pension asset currently recognized.
FASA = Pension asset to be recognized under SFAS 87.
cases reported in tables 3 through 5, when pension variables are included, the R² increase
is statistically significant at levels below 0.1 percent (with a standard F test).

The OLS results suggest that several measures of pension assets and liabilities are
significant in explaining firms’ market value of equity; yet the levels and, in some cases,
the signs of the coefficients are different from the theoretical value of one. This suggests
that measurement errors may be causing large coefficient bias. In addition, measurement
errors appear to differ between pension asset and liability measures. The PAi
and PLi coefficients differ significantly from each other in more than 50 percent of the
cases (both for all pairs and “plausible” pairs). This further suggests that netting the
two measures may obscure their measurement error characteristics.

Table 3
Regressions Without Consideration of Measurement Error

$MVE = \gamma_0 + \gamma_1 BVA + \gamma_2 BVL + \gamma_3 PLNA + \gamma_4 PL_i$

MVE = Market value of common equity.
ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
PBO = Projected benefit obligation.
BVPL = Pension liability currently recognized.
FASL = Pension liability to be recognized under SFAS 87.
PLNA = Fair value of pension assets.
BVA = Book value of nonpension assets.
BVL = Book value of nonpension liabilities.
(t) = t-statistics. A t-statistic in square brackets indicates the related coefficient is significantly
different from one at the five percent level (for variables only).
PLNA combined with ABO, VBO, and PBO (above dotted lines) may be most plausible.

Table 4
Regressions Without Consideration of Measurement Error

$MVE = \gamma_0 + \gamma_1 BVA + \gamma_2 BVL + \gamma_3 BVPA + \gamma_4 PL_i$

MVE = Market value of common equity.
ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
PBO = Projected benefit obligation.
BVPL = Pension liability currently recognized.
FASL = Pension liability to be recognized under SFAS 87.
BVPA = Pension asset currently recognized.
BVA = Book value of nonpension assets.
BVL = Book value of nonpension liabilities.
(t) = t-statistics. A t-statistic in square brackets indicates the related coefficient is significantly
different from one at the five percent level (for variables only).
BVPA combined with BVPL and FASL (below dotted lines) may be most plausible.
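
The text reports that adding the pension variables to the benchmark regression of MVE on BVA and BVL raises R² significantly under a standard F test. A minimal sketch of such an incremental F statistic follows, assuming the usual nested-model form based on residual sums of squares. The data are synthetic and the function names `rss` and `incremental_f` are hypothetical; this is an illustration, not the study's computation.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def incremental_f(X_restricted, X_full, y):
    """F statistic for the q regressors in X_full that are absent from X_restricted."""
    n, k_full = X_full.shape
    q = k_full - X_restricted.shape[1]
    rss_r = rss(X_restricted, y)
    rss_f = rss(X_full, y)
    return ((rss_r - rss_f) / q) / (rss_f / (n - k_full))

# Synthetic stand-ins for the regression variables (illustrative assumption).
rng = np.random.default_rng(1)
n = 300
bva, bvl, pa, pl = rng.normal(size=(4, n))
mve = bva + bvl + pa + pl + rng.normal(size=n)

ones = np.ones(n)
X_r = np.column_stack([ones, bva, bvl])          # benchmark: BVA and BVL only
X_f = np.column_stack([ones, bva, bvl, pa, pl])  # adds pension asset and liability
f_stat = incremental_f(X_r, X_f, mve)
```

The statistic would be compared with an F distribution with q and n − k degrees of freedom, where k counts all regressors in the full model including the intercept.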


Empirical Results—Comparison of Pension Alternatives 


Results from comparing pension asset and liability alternatives when measurement 
error is considered are presented in tables 6 and 7. Generally, the results are similar 
across years. In the summary exhibits below, possibly implausible counterparts are
marked by asterisks. It is noteworthy that even when matched with possibly implausible
counterparts, the disclosed liability amounts (e.g., ABO, VBO, and PBO) show less measurement
error than the “plausible” alternatives in some cases. This is not the case for
the amounts recognized (BVPA, BVPL) or to be recognized (FASA, FASL) in the balance
sheet.

Table 5
Regressions Without Consideration of Measurement Error

$MVE = \gamma_0 + \gamma_1 BVA + \gamma_2 BVL + \gamma_3 FASA + \gamma_4 PL_i$

MVE = Market value of common equity.
ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
PBO = Projected benefit obligation.
BVPL = Pension liability currently recognized.
FASL = Pension liability to be recognized under SFAS 87.
FASA = Pension asset to be recognized under SFAS 87.
BVA = Book value of nonpension assets.
BVL = Book value of nonpension liabilities.
(t) = t-statistics. A t-statistic in square brackets indicates the related coefficient is significantly
different from one at the five percent level (for variables only).
FASA combined with BVPL and FASL (below dotted lines) may be most plausible.

Table 6
Regressions Setting γ = 1 − B Comparing Pension Asset Alternatives

Estimates of Y3 (the term containing the pension asset measurement error variance)

               PLi = ABO                      PLi = VBO                      PLi = BVPL
PAi       1987      1986      1985       1987      1986      1985       1987      1986      1985

PLNA    270.43    321.79     32.18     247.12    285.21     31.36     -90.37    -37.26    -17.51
 (t)   (27.12)   (25.58)    (8.58)    (24.60)   (23.31)    (8.38)    (-8.25)   (-2.52)       ...
BVPA      6.41      5.81      0.32       5.58      4.98      0.31      -0.37      0.16      0.05
 (t)   (16.41)   (13.67)       ...    (14.28)   (11.78)       ...    (-1.18)    (0.45)    (1.28)
FASA      6.91      5.71      0.32       5.99      4.86      0.31      -0.34      0.15      0.02
 (t)   (14.89)   (13.53)   (10.02)    (12.95)   (11.63)       ...        ...    (0.42)    (0.49)
nobs     1,042       671       145      1,042       671       145      1,042       671       145

ABO, VBO combined with PLNA; BVPL with BVPA, FASA may be most plausible.

χ² for Hypothesis Tests

                               PLi = ABO                PLi = VBO                PLi = BVPL
Hypothesis                  1987    1986    1985     1987    1986    1985     1987    1986    1985

σ²(PLNA) = σ²(BVPA)        751.1   667.3    73.2    618.9   555.9    70.0     70.1     6.6    17.7
σ²(PLNA) = σ²(FASA)        754.5   667.3    73.3    621.8   556.0    70.0     70.4     6.6    17.6
σ²(BVPA) = σ²(FASA)          7.9     8.6     0.5      5.2     7.3     0.6      0.0     0.2     8.9
σ²(i) = σ²(j), all i, j    764.3   667.6   130.6    630.7   556.3   117.8     71.8     7.2    24.0

ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
BVPL = Pension liability currently recognized.
PLNA = Fair value of pension assets.
BVPA = Pension asset currently recognized.
FASA = Pension asset to be recognized under SFAS 87.
BVA = Book value of nonpension assets.
BVL = Book value of nonpension liabilities.
Y3 and Y4 are generally significantly different from zero. In accordance with the
benchmark regressions, this suggests significant bias in the coefficients and indicates
that at least one bias term is not equal to zero. However, it does not capture the measurement
errors in the pension alternatives. Even if Y3 or Y4 were equal to zero, there
could be significant measurement error in the pension alternatives because such
measurement errors are only one component of Y3 and Y4.

Table 7
Regressions Setting γ = 1 − B Comparing Pension Liability Alternatives

Estimates of Y4 (the term containing the pension liability measurement error variance)

                PAi = PLNA                         PAi = BVPA                         PAi = FASA
PLi        1987       1986      1985        1987       1986      1985        1987       1986      1985

ABO       68.37     -22.89     15.15     -192.40    -220.32    -32.02     -214.30    -219.11    -32.01
 (t)    (10.88)    (-3.08)    (7.91)    (-32.76)   (-28.85)  (-17.10)    (-52.38)   (-29.12)  (-17.05)
VBO       66.14     -24.20     15.88     -191.88    -223.00    -33.25     -212.78    -221.76    -57.71
 (t)    (10.70)    (-3.31)    (7.98)    (-33.20)   (-30.09)  (-17.09)    (-52.64)   (-30.36)  (-17.05)
PBO       80.52     -32.18     13.77     -196.61    -237.28    -30.23     -211.22    -236.02    -30.20
 (t)    (10.00)    (-4.46)    (7.67)    (-35.52)   (-32.30)  (-17.23)    (-53.08)   (-32.60)  (-17.16)
BVPL    -101.67    -192.37    -16.61    -207.168   -319.468    -23.42     -199.66    -317.30    -23.45
 (t)   (-35.56)   (-37.98)  (-21.47)    (-68.95)   (-56.83)  (-21.47)    (-71.56)   (-57.75)  (-20.98)
FASL    -375.59    -207.87    -17.47     -467.82    -326.11    -23.92     -355.80    -325.05    -23.40
 (t)  (-118.21)   (-40.51)  (-22.40)   (-153.64)   (-58.04)  (-21.89)    (-89.52)   (-60.04)  (-20.82)
nobs      1,042        671       145       1,042        671       145       1,042        671       145

PLNA combined with ABO, VBO, PBO; BVPA, FASA with BVPL, FASL may be most plausible.
(continued)

The results of the pension asset alternatives comparison indicate that, when either
the accumulated or vested benefit obligation (ABO or VBO) measures pension liabilities,
the fair value of plan assets exhibits far less measurement error than either the
amount currently recognized (i.e., for the sample years in this study) or to be recognized
under SFAS 87. The χ² statistics exceed 70 in all years (and exceed 550 in 1987
and 1986). When BVPL measures pension liabilities, the ranking is reversed. The χ²
statistics are again significant (except for BVPA = FASA in 1987 and 1986) but much
smaller than when either ABO or VBO is the liability measure.

The complete asset alternative rankings are listed in exhibit 1 from lowest to
highest measurement error. For example, table 6 reports that when the accumulated benefit
obligation, ABO, measures pension liabilities in 1987, the estimate of Y3 containing
the measurement error variance (σ²) of PLNA is 270.43, the estimate of Y3 containing
σ² of FASA is 6.91, and the estimate of Y3 containing σ² of BVPA is 6.41. Therefore,
the fair value of plan assets (PLNA) is ranked as having less measurement error than the
asset to be recognized under SFAS 87 (FASA) and the asset
currently recognized (BVPA).
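
The ranking rule in this example can be written out as a short sketch. It reads the example as implying that a larger Y3 estimate corresponds to less measurement error (PLNA, with the largest estimate, is ranked first); the variable names are hypothetical, and the three values are those quoted above for 1987 with ABO as the liability measure.

```python
# Y3 estimates for 1987 with ABO as the liability measure (values quoted in the text).
y3_abo_1987 = {"PLNA": 270.43, "BVPA": 6.41, "FASA": 6.91}

# Rank from lowest to highest measurement error, i.e., from largest to smallest Y3.
ranking = sorted(y3_abo_1987, key=y3_abo_1987.get, reverse=True)

for rank, measure in enumerate(ranking, start=1):
    print(rank, measure, y3_abo_1987[measure])
```

The resulting order, PLNA then FASA then BVPA, matches the 1987 ABO column of exhibit 1.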

The five liability alternatives’ ranking, based on magnitudes of the Y4 reported in
table 7 for the three pension asset alternatives, is presented in exhibit 2. The results generally
indicate that the accumulated benefit obligation exhibits the least measurement
error. The χ² statistics indicate significant differences except for FASL = BVPL in 1985,
when the amount to be recognized under SFAS 87 is the asset measure.

Research design modification enabling direct testing of alternative pairs without a 
common element is left for future research, and so inferences in this study are limited 


to comparing either the pension asset or liability measures holding the other constant.
Yet, some evidence is provided by comparing Y3 + Y4 for pairs of assets and liabilities
with standard t-tests. For example, when the pairs [PLNA:ABO], [BVPA:BVPL], and
[FASA:FASL] are compared by summing within-pair estimates of Y3 and Y4, in all three
years the pair [PLNA:ABO] shows significantly less measurement error than the other
two pairs (t = 44.90, t = 39.40, and t = 16.50 in 1987, 1986, and 1985). Only in 1987 are
the pairs [BVPA:BVPL] and [FASA:FASL] significantly different from each other

29 Results for liability measures exhibiting the least measurement error for each asset alternative, ABO,
VBO, and BVPL, are reported in table 6. Unreported results for PBO and FASL are similar.

Table 7 (continued)

χ² for Hypothesis Tests

PAi = BVPA
    1985        1987        1986
    86.0        12.9        65.7
   104.9        40.1       271.1
   104.5        75.8       322.8
   105.7       418.4       335.5
   381.8         7.0       171.9
      **          **          **
      **          **          **
    38.2    53,599.8       412.0
   725.9    96,101.7     1,113.2

PAi = FASA
    1985        1987        1986
   142.8       285.3        65.4
   143.5        52.7       280.6
   171.9        19.6       336.7
   173.1       437.1       351.0
      **        17.2       177.0
      **     1,468.8          **
    13.8          **          **
     8.7          **       718.7
   245.7   73,900.00     1,858.0

** = not tested: for ABO, VBO, and PBO, only ... alternatives are tested.
ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
PBO = Projected benefit obligation.
BVPL = Pension liability currently recognized.
FASL = Pension liability to be recognized under SFAS 87.
PLNA = Fair value of pension assets.
BVPA = Pension asset currently recognized.
FASA = Pension asset to be recognized under SFAS 87.
BVA = Book value of nonpension assets.
BVL = Book value of nonpension liabilities.


30 This test allows for different variances but assumes independence.

Exhibit 1

Pension Asset Measurement Error Ranking Lowest to Highest (1-3)
Matched with Three Pension Liability Alternatives

               ABO                      VBO                      BVPL
Rank    1987    1986    1985     1987    1986    1985     1987    1986    1985

1.      PLNA    PLNA    PLNA     PLNA    PLNA    PLNA     FASA    BVPA    BVPA
2.      FASA*   BVPA*   FASA*    FASA*   BVPA*   FASA*    BVPA    FASA    FASA
3.      BVPA*   FASA*   BVPA*    BVPA*   FASA*   BVPA*    PLNA*   PLNA*   PLNA*

ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
BVPL = Pension liability currently recognized.
PLNA = Fair value of pension assets.
BVPA = Pension asset currently recognized.
FASA = Pension asset to be recognized under SFAS 87.
* Possibly implausible counterpart.


Exhibit 2

Pension Liability Measurement Error Ranking Lowest to Highest (1-5)
Matched with Each Pension Asset Alternative

               PLNA                     BVPA                     FASA
Rank    1987    1986    1985     1987    1986    1985     1987    1986    1985

1.      ABO     ABO     VBO      VBO*    ABO*    BVPL     BVPL    ABO*    BVPL
2.      VBO     VBO     ABO      ABO*    VBO*    FASL     PBO*    VBO*    FASL
3.      PBO     PBO     PBO      PBO*    PBO*    ABO*     VBO*    PBO*    ABO*
4.      BVPL*   BVPL*   BVPL*    BVPL    BVPL    PBO*     ABO*    BVPL    PBO*
5.      FASL*   FASL*   FASL*    FASL    FASL    VBO*     FASL    FASL    VBO*

PBO = Projected benefit obligation.
ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
BVPL = Pension liability currently recognized.
FASL = Pension liability to be recognized under SFAS 87.
PLNA = Fair value of pension assets.
BVPA = Pension asset currently recognized.
FASA = Pension asset to be recognized under SFAS 87.
* Possibly implausible counterpart.


(t = 20.46, t = 0.70, and t = 0.28), and then [BVPA:BVPL] shows less measurement error
than [FASA:FASL].
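
A t-test of the difference between two estimates that allows different variances but assumes independence can be sketched as follows. The function name `diff_t` and the input numbers are hypothetical; the standard errors of the summed Y3 + Y4 estimates are not reported in this section, so the example does not reproduce the t-statistics quoted above.

```python
import math

def diff_t(est_a, se_a, est_b, se_b):
    """t-statistic for est_a - est_b when the two estimates are independent."""
    # Under independence the variance of the difference is the sum of the variances.
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Hypothetical inputs for illustration only.
t_stat = diff_t(3.0, 0.3, 1.0, 0.4)
print(round(t_stat, 2))
```

If the two estimates were positively correlated, the independence assumption would overstate the variance of the difference and make the test conservative.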

So far, the variance of the measurement errors has been the focus of the analysis.
However, as previously noted, estimates of s_i provide evidence on pension liability
alternatives’ magnitude bias. Without additional assumptions, the s_i and Y1, ..., Y4
cannot be simultaneously estimated. Therefore, the s_i values are estimated separately with
an instrumental variables approach. In this approach, each pension liability alternative
is regressed on instruments in the first stage. The predicted values from these regressions
($\widehat{PL}_i$) replace PLi in the following second-stage regression:

$(MVE - BVA - BVL - PA_i) = \alpha + \beta \widehat{PL}_i + e.$    (5)

Note that e is the sum of the first-stage residuals and all variables’ difference terms. If
the difference terms in e are uncorrelated with the instruments (and therefore with
$\widehat{PL}_i$), then 1/β is a consistent estimate of s_i. However, there is some cost in efficiency.
MVE, number of employees, sales, net income, and pension and retirement expense are
used as instruments in the first stage.31

Table 8
Instrumental Variables Estimates of s_i

$(MVE - BVA - BVL - PA_i) = \alpha + (1/s_i) PL_i + e$

                 PAi = PLNA                          PAi = BVPA                          PAi = FASA
PLi         1987       1986      1985         1987       1986      1985         1987       1986      1985

s_ABO       1.21       1.39      1.13        -1.90      -1.79     -1.13        -1.92      -1.79     -1.13
 (t-1)    (3.34)     (3.90)    (0.60)     (-17.07)   (-13.89)   (-9.56)     (-16.82)   (-13.93)   (-9.54)
 (t-0)   (19.47)    (13.80)    (5.15)     (-11.19)    (-8.91)   (-5.08)     (-11.06)    (-8.94)   (-5.07)
s_VBO       1.12       1.31      1.11        -1.68      -1.50     -1.13        -1.68      -1.50     -1.13
 (t-1)    (2.00)     (3.07)    (0.54)     (-18.73)   (-16.01)  (-10.72)     (-18.43)   (-16.03)   (-9.48)
 (t-0)   (18.67)    (12.99)    (5.20)     (-11.69)    (-9.81)   (-5.68)     (-11.54)    (-9.61)   (-5.03)
s_PBO       1.66       2.12      1.29        -1.77      -1.58     -1.28        -1.78      -1.60     -1.29
 (t-1)    (6.11)     (5.22)    (1.16)     (-22.38)   (-31.65)   (-9.11)     (-22.09)   (-20.19)   (-9.09)
 (t-0)   (15.37)     (9.87)    (5.08)     (-14.30)   (-19.39)   (-5.12)     (-14.15)   (-12.42)   (-5.11)
s_BVPL     -0.18      -0.07       ...          ...        ...      0.25          ...        ...      0.25
 (t-1)  (-14.10)  (-106.33)  (-54.44)    (-866.80)  (-809.40)   (-2.46)    (-858.15)  (-809.27)   (-2.57)
 (t-0)   (-2.17)    (-7.22)    (3.20)     (-17.83)   (-20.57)    (0.83)     (-17.82)   (-20.56)    (0.84)
s_FASL      0.22      -0.07      0.06        -0.09      -0.03      0.25        -0.09      -0.03      0.25
 (t-1)  (-11.54)  (-109.90)  (-53.29)    (-103.78)  (-773.85)  (-2.866)     (-94.55)  (-773.27)   (-2.78)
 (t-0)    (3.20)        ...    (3.28)      (-8.45)   (-19.70)    (0.89)      (-8.02)   (-19.69)    (0.91)

PLNA combined with ABO, VBO, PBO; BVPA, FASA with BVPL, FASL may be most plausible.
MVE = Market value of common equity.
ABO = Accumulated benefit obligation.
VBO = Vested benefit obligation.
PBO = Projected benefit obligation.
BVPL = Pension liability currently recognized.
FASL = Pension liability to be recognized under SFAS 87.
PLNA = Fair value of pension assets.
BVPA = Pension asset currently recognized.
FASA = Pension asset to be recognized under SFAS 87.
BVA = Book value of nonpension assets.
BVL = Book value of nonpension liabilities.
(t-1) = t-statistic for null hypothesis that estimate equals one.
(t-0) = t-statistic for null hypothesis that estimate equals zero.
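
The two-stage procedure behind equation (5) can be sketched on synthetic data. The data-generating process below is an assumption made for illustration: the observed liability equals s times the amount investors assess plus noise, the dependent variable equals the assessed amount plus noise, and two hypothetical instruments are related to the assessed amount but not to the noise. Names such as `mvpl` and `s_true` are invented for the example.

```python
import numpy as np

def ols(X, y):
    """OLS coefficient vector."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(42)
n = 5000
s_true = 1.2                                          # assumed magnitude bias
z = rng.normal(size=(n, 2))                           # hypothetical instruments
mvpl = z @ np.array([1.0, 0.5]) + rng.normal(size=n)  # amount investors assess
pl = s_true * mvpl + rng.normal(scale=2.0, size=n)    # disclosed measure with error
dep = mvpl + rng.normal(size=n)                       # stands in for MVE - BVA - BVL - PA

ones = np.ones(n)
Z = np.column_stack([ones, z])

# First stage: regress the liability measure on the instruments, keep fitted values.
pl_hat = Z @ ols(Z, pl)

# Second stage: regress the dependent variable on the fitted values.
beta_iv = ols(np.column_stack([ones, pl_hat]), dep)[1]
s_iv = 1.0 / beta_iv

# For comparison, OLS on the error-laden measure gives an attenuated slope.
beta_ols = ols(np.column_stack([ones, pl]), dep)[1]
s_ols = 1.0 / beta_ols
print(round(s_iv, 2), round(s_ols, 2))
```

In this setup the instrumental variables estimate recovers a value near the assumed s, while the OLS slope is attenuated by the measurement error, so its reciprocal overstates s.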

Table 8 presents the estimation results of s_i that complement the variance analyses.
The results clearly suggest that the disclosed liability amounts exhibit less magnitude


bias than the recognized amounts. Sensible estimates for s_i are obtained only when
PLNA and VBO, ABO, or PBO are the pension asset and liability measures.

Estimates of s_VBO are close to one in magnitude (1.12, 1.31, and 1.11 in 1987, 1986,
and 1985) when PLNA is the asset measure, even though they are statistically different
from one in 1987 and 1986. Estimates of s_ABO and s_PBO are somewhat larger, but still
relatively close to one. In 1985, the null of s_i = 1 cannot be rejected at conventional levels for
s_ABO, s_VBO, or s_PBO (the null of s_i = 0 is rejected). In contrast, estimates of s_BVPL and s_FASL
are much less than, and significantly different from, one in all three years with all three
asset measures (they range from -0.18 to 0.22). Often they have the wrong sign.
Additionally, when either FASA or BVPA is the asset measure, s_ABO, s_VBO, and s_PBO have
the wrong signs and are significantly different from both zero and one in all three
years.

31 Because of potential correlation between the instruments and difference terms, the estimation was
repeated by using three subsets of instruments. The results were similar.

The s_i estimation results clearly suggest that the disclosed amounts (PLNA, ABO,
VBO, and PBO) are the most consonant with investors’ assessments. Taken together,
the measurement error variance and s_i estimation results suggest that the accumulated
benefit obligation (ABO) is the pension liability measure that appears to be most relevant
and reliable to investors in most instances. While estimates of s_VBO are closer to
one than estimates of s_ABO in all three years (when PLNA is the pension asset measure),
they are not statistically different in any year (t = 1.04, t = 0.56, and t = 0.07 in 1987,
1986, and 1985). This indicates that, statistically, they have the same magnitude bias but
ABO exhibits less measurement error variance in 1987 and 1986 as reported above.32
This is consistent with apparent preferences of accounting standard-setters; the accumulated
benefit obligation triggers recording a liability under SFAS 87 when it exceeds
the fair value of plan assets. The poor performance of book values is also consistent
with the perceived inadequacies of APB 8 (footnote 33) and provides evidence that market
participants have an “economic substance” rather than a “legal form” view of firm pension
assets and liabilities.

The consistently poor performance of the SFAS 87 amounts, FASA and FASL, rela- 
tive to the fair value of plan assets and the projected, accumulated, and vested benefit 
obligations indicates that SFAS 87 disclosures provide more relevant and reliable mea- 
sures of pension assets and liabilities than those to be included in the balance sheet. 
Apparently, the lack of symmetry accorded overfunded and underfunded plans, and 
other compromises in their calculation, have made FASA and FASL less appropriate 
than perhaps the FASB had anticipated. The performance of FASA and FASL may 
improve over time as they partially reflect the projected benefit obligation through net 
periodic pension cost. However, these numbers are small relative to unrecorded net 
plan assets for overfunded plans. Currently, many pension plans are overfunded; 79
percent of the 1987 firms have either only overfunded plans (58 percent) or net overfunded
plans (21 percent).34 Thus, recognizing only the full underfunded position and
not recognizing the full overfunded position is perhaps a larger problem for the sample
years than in a time with the reverse situation.


32 Estimates of s_PBO are statistically larger than s_ABO in 1987 and 1986 (t = 3.61 and t = 3.07), indicating PBO
has larger magnitude bias than ABO, in addition to the larger measurement error variance noted above. In
1985, s_PBO and s_ABO are statistically indistinguishable (t = 0.48).

33 As previously noted, BVPA and BVPL are not equivalent to the APB 8 asset and liability amounts, but
they are closely related.

34 The 1986 and 1985 percentages are 87 and 90 percent.


454 The Accounting Review, July 1991 


The accumulated benefit obligation’s having less measurement error than either the 
projected or vested benefit obligation has interesting interpretations. First, the rela- 
tively poor performance of vested benefits is consistent with investors’ adopting a 
going-concern perspective for pension liabilities. Second, accumulated benefits’ exhib- 
iting less measurement error than projected benefits is consistent with investors’ not 
viewing salary progression as part of the present pension obligation, or with the projected
benefits measure’s being too noisy to be reliable, or both. It is also consistent
with the systematic understatement in the accumulated benefits measure being less
problematic than problems with the projected benefits measure.


Accumulated Versus Projected Benefits 


To directly address whether future salary increases are considered by investors as
part of the firm’s pension liability when they are included in the benefit formula, the
sample is partitioned using two criteria. First, firms with a large ratio (greater than 1.2)
of the projected to accumulated benefit obligation measures are identified. These firms,
with noticeably different PBO and ABO, should include those with pension benefit formulas
depending noticeably on salary progression and thus are the focus of this analysis.
Second, firms whose assumed salary progression rate includes only expected inflation
are segregated from those whose rate includes a measure of expected productivity
changes. As noted in section I, if the projected benefit obligation includes salary progression
only equal to expected inflation, the two measures are conceptually the same,
and PBO is, in essence, a more internally consistent calculation of ABO. For partitioning,
it is assumed that firms using a salary progression rate between five and six percent
are effectively including expected inflation only.35 Firms using rates less than five
or greater than six percent are assumed to be also including a productivity factor (gain
or loss).36
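
The two partitioning criteria can be sketched in a few lines. The `Firm` record, the example firms, and the treatment of the five and six percent endpoints as inside the inflation-only range are assumptions made for illustration; the text does not say how boundary values were classified.

```python
from dataclasses import dataclass

@dataclass
class Firm:
    name: str
    pbo: float          # projected benefit obligation
    abo: float          # accumulated benefit obligation
    salary_rate: float  # assumed salary progression rate, in percent

def partition(firms, ratio_cut=1.2, infl_low=5.0, infl_high=6.0):
    """Group firm names by (large PBO/ABO ratio, inflation-only salary rate)."""
    cells = {(large, infl): [] for large in (True, False) for infl in (True, False)}
    for f in firms:
        large_ratio = f.pbo / f.abo > ratio_cut
        # Endpoints are treated as inflation-only; this is an assumption.
        inflation_only = infl_low <= f.salary_rate <= infl_high
        cells[(large_ratio, inflation_only)].append(f.name)
    return cells

# Hypothetical firms; liabilities carry a negative sign, as in table 2.
firms = [
    Firm("A", pbo=-150.0, abo=-100.0, salary_rate=5.5),
    Firm("B", pbo=-150.0, abo=-100.0, salary_rate=7.0),
    Firm("C", pbo=-105.0, abo=-100.0, salary_rate=5.0),
    Firm("D", pbo=-110.0, abo=-100.0, salary_rate=4.0),
]
cells = partition(firms)
print(cells[(True, False)])  # large ratio, rate includes a productivity factor
```

The cell with a large ratio and a rate outside the five to six percent range is the subsample where salary progression should matter most for the comparison of PBO and ABO.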

The results are presented in table 9. For the subsample with large PBO-to-ABO 
ratios and with salary progression rates including a productivity component, the pro- 
jected benefit obligation exhibits significantly less measurement error than the accu- 
mulated benefit obligation in 1987 and 1986. This is consistent with investors’ consider- 
ing future salary changes as part of the firm’s pension obligation when the pension 
benefit formula depends on future salaries. If PBO had not exhibited less measurement 
error on this subsample, it would not have been possible to distinguish whether PBO is 
too noisy (due, e.g., to the salary progression rate assumption) or whether investors do 
not implicitly consider expected future salary changes part of the pension liability. In 
either case, PBO would not have satisfied FASB criteria for measurement methods. In 
the subsample with large PBO-to-ABO ratios but with salary progression rates effec- 
tively including only inflation, the accumulated benefit obligation exhibits significantly 


33 The second partition requires an estimate of expected inflation over a period comparable to the average 
employees’ service lives. Estimates of average inflation from 1988 to 2000 by Data Resources (1988) and the 
Institute for the Future (1988) are approximately 5.5 percent. Since the partitioning objective is to identify 
firms with salary progression rates which include a productivity factor, and not just expected inflation, firms 
using salary progression rates different from 5.5 percent are of interest. Because 5.5 percent is an estimate, a 
range of five to six percent is assumed to include expected inflation only. Rates outside of this range are
assumed to include a productivity factor as well.

34 Firms often select a portfolio of rates. Salary progression rate magnitude may not fully capture
cross-sectional differences in expected future salary changes. Differences between the discount and salary
progression rates were used in the second partition with qualitatively similar results.
Table 9
Comparing $\sigma^2_{u_{ABO}}$ and $\sigma^2_{u_{PBO}}$ on Subsamples
Estimates of $Y_4 = -(\sigma_{MVPL,u_{BVA}} + \sigma_{MVPL,u_{BVL}}) - \sigma^2_{u_{PLi}}$

Firms With Ratio of PBO to ABO Greater Than 1.2

                  SALRT = Inflation                  SALRT ≠ Inflation
PLi           1987       1986       1985        1987       1986       1985
ABO         -62.28    -139.04      -0.48      -45.33    -114.97      -4.81
(t)        (-9.13)  (-286.38)    (-2.14)     (-8.77)    (-7.79)    (-3.60)
PBO         -69.65    -163.11      -0.51      -43.18    -109.14      -5.14
(t)        (-9.39)   (-26.57)    (-2.31)     (-6.77)    (-7.84)    (-4.02)
$\chi^2$*    136.5      680.1       11.2        42.1       43.2       99.1
nobs           331        237         54         225        172         41

Firms With Ratio of PBO to ABO Less Than 1.2

                  SALRT = Inflation                  SALRT ≠ Inflation
PLi           1987       1986       1985        1987       1986       1985
ABO         459.16     386.41       0.15      -35.59       8.36     103.34
(t)        (29.15)    (18.41)     (0.41)     (-6.47)     (3.91)    (21.17)
PBO         448.99     386.59       0.11      -33.44       7.61     119.53
(t)        (29.15)    (18.43)     (0.30)     (-6.35)     (4.01)    (20.44)
$\chi^2$*    529.6       12.0       12.2        58.8        9.1       85.8
nobs           274        171         29         212         91         21

* $\chi^2(1)_{.05} = 3.8$, $\chi^2(1)_{.025} = 5.0$, $\chi^2(1)_{.005} = 7.9$.
ABO = Accumulated benefit obligation.
PBO = Projected benefit obligation.
PLNA = Fair value of pension assets.
BVA = Book value of nonpension assets.
BVL = Book value of nonpension liabilities.
SALRT = Assumed rate of compensation increase.


less measurement error than the projected benefit obligation in all three years as was 
the case in the full sample. This is consistent with the projected benefit obligation’s 
being more noisy than the accumulated benefit obligation when expected future pro- 
ductivity is not included, even though ABO is internally inconsistent in its calculation. 
Results for firms with low PBO-to-ABO ratios, partitioned on the basis of the salary
progression rate, are also presented in table 9. Even with small salary progression ef- 
fects, in 1987 and 1986, the projected benefit obligation again exhibits significantly less 
measurement error than the accumulated benefit obligation for firms including a pro- 
ductivity component in the salary progression rate. For firms that include inflation 
only, the accumulated benefit obligation again exhibits significantly less measurement 
error than the projected benefit obligation in 1987 and 1985. The similarity of these 
results to those of the large-ratio firms strengthens the conclusion that future salaries 
are relevant in the pension obligation calculation but that the projected benefit obliga- 
tion is perceived as more noisy than the accumulated benefit obligation when expected 
future productivity is not a factor. 


IV. Limitations 


This study has potential limitations, especially since drawing policy implications
from statistical relations is difficult. Other factors (e.g., cost/benefit or social welfare 
considerations) may make it inappropriate to recognize the alternative measure having 
the least measurement error in the balance sheet. Another question concerns when a 
measurement error is considered small enough. 

The focus in this research is on relevance and reliability of the alternative measures 
for investors’ use. The definitions of relevance and reliability are complex and judg- 
mental, and may not be fully captured in their operationalization in the research de- 
sign. The predictive and feedback value aspects of relevance are operationalized by 
examining differences between the alternatives and the amounts implicitly assessed by 
investors. Reliability is operationalized by focusing on variances of these differences. 
Although the research design’s primary focus is on precision, magnitude bias also de- 
tracts from reliability. Therefore, evidence is also provided on bias in the pension lia- 
bility measures. The timeliness and neutrality aspects of relevance and reliability are 
not addressed. 

In addition, investors are only one group of users considered by the FASB. Further, 
the analysis assumes that securities’ prices reflect all information relevant to investors 
for assessing firm market value, although it is not assumed that their assessment reveals 
“truth.” Nonetheless, investors are a large class of financial statement users, and their 
assessment is interesting in its own right. The approach also provides evidence of po- 
tential interest to standard-setters in assessing measurement alternatives and has 
potential benefits for researchers seeking to relate reported accounting numbers to firm 
value. 


V. Summary and Concluding Remarks 


This study investigates which measures of pension assets and liabilities, disclosed 
under SFAS 87, most closely reflect those investors implicitly assess when they value a 
firm. The research addresses the FASB relevance and reliability criteria for evaluating 
measurement alternatives. It uses a cross-sectional model in a way that allows the re- 
searcher to exploit measurement error as an explicit feature in the research design to 
compare levels of measurement error rather than purge or ignore it. A model of differ- 
ences between market and book values is presented. When placed in a regression con- 
text, these differences become econometric measurement error. The difference model
and the underlying valuation equation provide the framework for comparing the differ- 
ence term variances across the measurement alternatives investigated. Evidence is also 
presented on systematic magnitude bias in the pension liability alternatives. 

The fair value of plan assets and all three disclosed pension liability amounts ex- 
hibit significantly less measurement error than the amounts presently recognized in the 
balance sheet or to be recognized under SFAS 87. Among the three disclosed liability 
alternatives (the accumulated, vested, and projected benefit obligations), the accumu- 
lated benefit obligation is found to have significantly less measurement error than the 
others for the full sample. When the sample is partitioned to address specific aspects of 
the pension accounting controversy relating to comparison of the accumulated and 
projected benefits measures, the projected benefit obligation exhibits less measurement 
error than the accumulated benefit obligation on the subsample incorporating expected 
productivity changes. This is consistent with investors’ viewing future salary progres- 
sion as part of the firm’s pension liability. Noise in the projected benefits measure may 
be responsible for finding less measurement error in the accumulated benefit obligation 
when the two measures are essentially measuring the same obligation. The results sug- 
gest that SFAS 87 disclosures include financial accounting measures of pension assets 
and liabilities closer to those investors assess than the measures required to be recog- 
nized in the balance sheet. The relatively poor performance of the recognized amounts 
is consistent with investors’ viewing pension plan assets and liabilities as firm assets 
and liabilities (the “economic substance” view). It is also consistent with investors’ 
viewing compromises made in SFAS 87 as rendering the amounts to be recognized less 
relevant and reliable than disclosed measures. 


Appendix A 
Characterization of the Bias 


Coefficient estimate bias is seen by replacing the unobservable variables in equation (1) with their
equivalents expressed in terms of observed variables and their respective difference terms. These difference 
terms are assumed to be the regression measurement error. 


$$MVE = BVA + BVL + PAi + PLi + \varepsilon, \tag{A1}$$
where = — {Urra Uca + Urri + Ucn t Urut Uruh (A2) 


When equation (A1) is estimated as equation (1), the OLS normal equations35 are not assumed but
imposed, which results in consistent estimates of the $\gamma_m$. As $\varepsilon$ is not independent of book
values and pension alternatives, $\gamma_m$ will not consistently estimate a coefficient of 1. Expressions for
differences between 1 and $\gamma_m$ are derived by using the difference term model. These expressions include
the pension alternatives’ measurement error variances, $\sigma^2_{u_{PAi}}$ and $\sigma^2_{u_{PLi}}$, as parameters.
Therefore, by imposing OLS normal equations together with a measurement error model, terms providing
evidence on pension alternative measurement error can be estimated.
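
The source of the inconsistency can be sketched in a few lines. The notation is an assumption introduced here for illustration: $X$ stacks the four observed regressors (BVA, BVL, PAi, PLi) in deviations from their means, $\iota$ is a $4 \times 1$ vector of ones, $\Sigma_{XX}$ is the covariance matrix of $X$, and $\varepsilon$ is the composite difference term in (A1).

```latex
\begin{align*}
MVE &= X\iota + \varepsilon
  && \text{(A1) in matrix form} \\
\operatorname{plim}\hat{\gamma}
  &= \Sigma_{XX}^{-1}\,E(X'\,MVE)
   = \Sigma_{XX}^{-1}\bigl(\Sigma_{XX}\,\iota + E(X'\varepsilon)\bigr)
   = \iota + \Sigma_{XX}^{-1}E(X'\varepsilon) \\
B &= \iota - \operatorname{plim}\hat{\gamma}
   = -\Sigma_{XX}^{-1}E(X'\varepsilon)
\end{align*}
```

If $\varepsilon$ were uncorrelated with every regressor, $E(X'\varepsilon)$ would vanish and each coefficient would converge to 1; the departures from 1 are therefore informative about the difference terms.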

The derived bias (inconsistency) can be expressed as $-\Sigma_{XX}^{-1}E(X'\varepsilon)$. In general, this will not equal zero.
Garber and Klepper (1980) evaluate it by considering the following auxiliary regression equations. 


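$$
\begin{aligned}
BVA &= \beta_{10} + \beta_{12}BVL + \beta_{13}PAi + \beta_{14}PLi + w_1, \\
BVL &= \beta_{20} + \beta_{21}BVA + \beta_{23}PAi + \beta_{24}PLi + w_2, \\
PAi &= \beta_{30} + \beta_{31}BVA + \beta_{32}BVL + \beta_{34}PLi + w_3, \text{ and} \\
PLi &= \beta_{40} + \beta_{41}BVA + \beta_{42}BVL + \beta_{43}PAi + w_4.
\end{aligned} \tag{A3}
$$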


35 These are $(1/N)\sum_n X_n \hat{\varepsilon}_n = 0$, where $N$ is sample size and $X_n$ are independent variables.
$\beta_{jk}$ are the auxiliary regression coefficients, and $w_j$ are disturbance terms with variances $\sigma^2_{w_j}$. These
regressions are projections of each observed variable onto the others. By construction, the $w_j$ have mean zero and
are uncorrelated with the regressors. Garber and Klepper show:36


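$$
-\Sigma_{XX}^{-1} =
\begin{bmatrix}
-1/\sigma^2_{w_1} & \beta_{21}/\sigma^2_{w_2} & \beta_{31}/\sigma^2_{w_3} & \beta_{41}/\sigma^2_{w_4} \\
\beta_{12}/\sigma^2_{w_1} & -1/\sigma^2_{w_2} & \beta_{32}/\sigma^2_{w_3} & \beta_{42}/\sigma^2_{w_4} \\
\beta_{13}/\sigma^2_{w_1} & \beta_{23}/\sigma^2_{w_2} & -1/\sigma^2_{w_3} & \beta_{43}/\sigma^2_{w_4} \\
\beta_{14}/\sigma^2_{w_1} & \beta_{24}/\sigma^2_{w_2} & \beta_{34}/\sigma^2_{w_3} & -1/\sigma^2_{w_4}
\end{bmatrix}
\tag{A4}
$$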
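
The relation between the auxiliary regressions and the inverse covariance matrix can be checked numerically. The sketch below is only an illustration: the covariance matrix `SIGMA` is an arbitrary made-up example (not data from the study), and the helper names are hypothetical. It builds the matrix of auxiliary coefficients over residual variances and compares it with the negative of the directly computed inverse, using exact rational arithmetic.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (a square and nonsingular)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def inverse(a):
    """Invert a by solving against each unit vector."""
    n = len(a)
    cols = [solve(a, [1 if i == j else 0 for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def auxiliary(sigma, j):
    """Population projection of variable j on the others.

    Returns ({k: beta_jk}, residual variance of w_j).
    """
    others = [k for k in range(len(sigma)) if k != j]
    sub = [[sigma[r][c] for c in others] for r in others]
    coefs = solve(sub, [sigma[r][j] for r in others])
    resid_var = sigma[j][j] - sum(c * sigma[k][j] for c, k in zip(coefs, others))
    return dict(zip(others, coefs)), resid_var


def neg_inverse_from_auxiliary(sigma):
    """Element (i, j) is beta_ji / var(w_j) off the diagonal and -1 / var(w_j) on it."""
    n = len(sigma)
    aux = [auxiliary(sigma, j) for j in range(n)]
    return [[Fraction(-1) / aux[j][1] if i == j else aux[j][0][i] / aux[j][1]
             for j in range(n)] for i in range(n)]


# Arbitrary symmetric, positive definite covariance matrix (illustrative only).
SIGMA = [[4, 1, 1, 0],
         [1, 3, 0, 1],
         [1, 0, 3, 1],
         [0, 1, 1, 3]]

if __name__ == "__main__":
    direct = [[-v for v in row] for row in inverse(SIGMA)]
    print(neg_inverse_from_auxiliary(SIGMA) == direct)
```

Because the inverse is symmetric, $\beta_{jk}/\sigma^2_{w_j}$ equals $\beta_{kj}/\sigma^2_{w_k}$, so the same matrix can be written with either index order in the off-diagonal positions.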














$E(X'\varepsilon)$ equals:
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— (OMYL wera + OMYL wua + Cura wun F Oair + Oke) = Yi (A5) 
$$-(\sigma_{MVPA,u_{BVA}} + \sigma_{MVPA,u_{BVL}}) - \sigma^2_{u_{PAi}} = Y_3$$
$$-(\sigma_{MVPL,u_{BVA}} + \sigma_{MVPL,u_{BVL}}) - \sigma^2_{u_{PLi}} = Y_4$$
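The bias, $B_m = 1 - \gamma_m$, can be expressed as:
$$B_3 = \left(\frac{\beta_{13}}{\sigma^2_{w_1}}\right)Y_1 + \left(\frac{\beta_{23}}{\sigma^2_{w_2}}\right)Y_2 + \left(\frac{\beta_{43}}{\sigma^2_{w_4}}\right)Y_4 - \left(\frac{1}{\sigma^2_{w_3}}\right)Y_3$$
$$B_4 = \left(\frac{\beta_{14}}{\sigma^2_{w_1}}\right)Y_1 + \left(\frac{\beta_{24}}{\sigma^2_{w_2}}\right)Y_2 + \left(\frac{\beta_{34}}{\sigma^2_{w_3}}\right)Y_3 - \left(\frac{1}{\sigma^2_{w_4}}\right)Y_4 \tag{A8}$$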
Terms in the $B_m$ are estimated in two stages. First, the $\beta_{jk}$ and $\sigma^2_{w_j}$ are estimated in the auxiliary
regressions. These estimates are used in the second stage, when $Y_1,\ldots,Y_4$ are estimated.37 The $\gamma_m$ are not
estimated in the second stage; $Y_1,\ldots,Y_4$ are estimated directly. A system of equations comprising equation (2),
setting $\gamma_m = 1 - B_m$, is used for each pension asset or liability alternative for a given liability or asset
alternative. $Y_3$ and $Y_4$ are of primary interest, as they contain $\sigma^2_{u_{PAi}}$ and $\sigma^2_{u_{PLi}}$. Although
$\sigma^2_{u_{PAi}}$ and $\sigma^2_{u_{PLi}}$ cannot be estimated directly, changes in $Y_3$ or $Y_4$ are attributed to changes in
$\sigma^2_{u_{PAi}}$ or $\sigma^2_{u_{PLi}}$. As a result, $Y_3$ and $Y_4$ vary when different pension asset and liability
alternatives are used in estimation.

Note that $\sigma^2_{u_{PAi}}$ and $\sigma^2_{u_{PLi}}$ are greater than zero. Therefore, the higher the measurement error
($\sigma^2_{u_{PAi}}$ or $\sigma^2_{u_{PLi}}$), the smaller is $Y_3$ or $Y_4$. Ranking $Y_3$ or $Y_4$ from highest to lowest is
equivalent to ranking the related alternatives from lowest measurement error to highest.
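
The ranking logic can be illustrated with a deliberately simplified simulation. Everything here is an assumption made for illustration: a single regressor, a market value equal to the unobservable true amount, and measurement noise uncorrelated with that amount. Under these assumptions the analog of $Y$ reduces to minus the measurement error variance, so the noisier measure yields the smaller (more negative) $Y$. The function and label names are hypothetical.

```python
import random


def estimate_y(noise_sd: float, n: int = 20000, seed: int = 0) -> float:
    """Estimate Y = (gamma - 1) * var(measure) for one noisy measure.

    In this one-regressor setting gamma is the OLS slope of market value on the
    measure, so Y is the sample covariance between the measure and the
    difference (market value minus measure).
    """
    rng = random.Random(seed)
    true_amount = [rng.gauss(0.0, 1.0) for _ in range(n)]            # unobservable
    measure = [t + rng.gauss(0.0, noise_sd) for t in true_amount]    # disclosed measure
    market = true_amount                                             # true coefficient is 1

    mean_x = sum(measure) / n
    mean_y = sum(market) / n
    var_x = sum((x - mean_x) ** 2 for x in measure) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measure, market)) / n
    gamma = cov_xy / var_x
    return (gamma - 1.0) * var_x


# Two hypothetical measures of the same amount with different noise levels.
results = {"low_noise": estimate_y(0.5), "high_noise": estimate_y(1.0)}

# Highest Y first corresponds to lowest measurement error first.
ranking = sorted(results, key=results.get, reverse=True)

if __name__ == "__main__":
    print(results, ranking)
```

The study's estimates come from a jointly estimated multi-regressor system that also allows nonzero covariances, so this sketch shows only the direction of the relation between $Y$ and the error variance.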

The equations that are jointly estimated have the following form. 


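$$MVE = \gamma_0 + \overbrace{\left[1 - \left(-\frac{Y_1}{\sigma^2_{w_1}} + \frac{\beta_{21}Y_2}{\sigma^2_{w_2}} + \frac{\beta_{31}Y_3}{\sigma^2_{w_3}} + \frac{\beta_{41}Y_4}{\sigma^2_{w_4}}\right)\right]}^{\gamma_1} BVA$$
$$\qquad + \overbrace{\left[1 - \left(\frac{\beta_{12}Y_1}{\sigma^2_{w_1}} - \frac{Y_2}{\sigma^2_{w_2}} + \frac{\beta_{32}Y_3}{\sigma^2_{w_3}} + \frac{\beta_{42}Y_4}{\sigma^2_{w_4}}\right)\right]}^{\gamma_2} BVL + \cdots$$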


36 Rao (1973) reports the sample analog of this.

37 The two-stage procedure affects reported standard errors. Approximations of asymptotic,
heteroscedasticity-consistent second-stage standard errors for 1987 as outlined in Newey (1984) are similar to
those used to construct reported test statistics.
Appendix B
Additional Specification Checks 


Several additional specification checks were performed to address implications for the reported results
of (1) including $s_i$ in the model of pension liabilities, (2) assuming the $s_i$ are cross-sectional constants, and (3)
frequently rejecting the White (1980) test. The additional checks support the study's overall conclusions.


Inclusion of $s_i$ in Model of Pension Liabilities


In section II, the model of pension liabilities is $PLi = s_i(MVPL + u_{PLi})$, where $s_i$ is included to recognize
that different pension liability measures have, essentially by construction, different magnitudes. Consequently,
a model without such a factor would be inappropriate. Excluding $s_i$ yields somewhat different results.
These results are presented in table B1. The ranking of pension liability alternatives is in exhibit B1
(again, possibly implausible counterparts are marked by asterisks).

The $\chi^2$ tests indicate all rank differences are significant except for ABO = VBO, FASL = PBO, and
FASL = BVPL in 1986, when PLNA is the asset measure, and for FASL = BVPL in 1985, when FASA is the asset
measure.

$Y_3 + Y_4$ for [PLNA:ABO], [BVPA:BVPL], and [FASA:FASL] were compared by using t-tests as described
in section III. As in the analyses that included $s_i$, in all three years the pair [PLNA:ABO] shows significantly
less measurement error than the other two pairs (t= 29.37, t= 22.69, and t= 11.56 in 1987, 1986, and 1985). In 
1987 and 1985, the pair [FASA:FASL] shows significantly less measurement error than the pair 
[BVPA:BVPL] (t=2.17 and t=4.82). In 1986, the pairs [FASA:FASL] and [BVPA:BVPL] are not significantly 
different from each other (t=0.19). 

The model of pension asset alternatives does not include a scale factor, as their measurement does not 
systematically result in magnitude differences. However, empirically, PLNA is much larger than BVPA or 
FASA. When the pension asset comparison was repeated with scaled asset measures, PLNA showed the least 
measurement error for all liability measures in all three years. 


Assumption that $s_i$ is Cross-sectional Constant


Supplemental evidence suggests that assuming $s_i$ to be a cross-sectional constant is not unreasonable and
has little effect on reported results: (1) In all three years there is no correlation between PBO/ABO (an
estimate of $s_{PBO}/s_{ABO}$, used in place of the unobservable $s_i$, as these two measures play an important role in the
study) and several firm-specific variables. (2) No strong industry relation was found in a regression of PBO/ABO
on dummy variables representing groups of two-digit SIC codes. (3) Results are similar to those reported
for all firms when the pension liability alternatives comparison was repeated using only firms in the middle
80 percent of the PBO/ABO distribution. (4) Estimates of $s_i$ (see sec. III) have very small standard errors
(except for some instances in 1985), which is consistent with an assumption of little cross-sectional variation.


Rejection of White (1980) Test 


The White (1980) statistic tests model specification as well as heteroscedasticity. In light of the frequency
of rejection, several specification checks were made: (1) The OLS regressions were estimated after taking
steps to avoid rejection of the White test. These steps included scaling by number of shares outstanding,
eliminating outliers, and/or partitioning into size groups. The $Y_1,\ldots,Y_4$ were then reestimated with essentially
unchanged qualitative results. (2) Cook's D-statistics (Cook 1977) and DFBETAs (Belsley et al. 1980) were
calculated to determine the effects of outliers on reported $R^2$. Although the $R^2$ are somewhat smaller with
elimination of observations with large statistic values, outliers do not drive the high reported $R^2$. (3) Condition
indices (Belsley et al. 1980) indicate that collinearity is not serious.
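
As an illustration of the outlier check in (2), the following minimal sketch computes Cook's D for a one-regressor OLS fit. The data are made up, and the 4/n flagging cutoff is a common rule of thumb assumed here for illustration; the study does not report the cutoff it used.

```python
def cooks_distance(x, y):
    """Cook's D for each observation in a simple regression of y on x (with intercept)."""
    n, p = len(x), 2                      # p counts the intercept and the slope
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)                 # residual variance
    leverage = [1 / n + (a - mean_x) ** 2 / sxx for a in x]  # hat-matrix diagonal
    return [e * e * h / (p * s2 * (1 - h) ** 2) for e, h in zip(resid, leverage)]


# Made-up data: seven points on the line y = 2x plus one outlying observation.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
Y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 40.0]

distances = cooks_distance(X, Y)
cutoff = 4 / len(X)   # assumed rule-of-thumb threshold
flagged = [i for i, d in enumerate(distances) if d > cutoff]

if __name__ == "__main__":
    print([round(d, 3) for d in distances], flagged)
```

Reestimating the regression without the flagged observations and comparing the resulting $R^2$ with the full-sample value mirrors the check described in the text.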
Exhibit B1
Pension Liability Measurement Error Ranking Lowest to Highest (1-5)
Matched with Each Pension Asset Alternative
No $s_i$ in Model

               PLNA                      BVPA                      FASA
Rank   1987    1986    1985      1987    1986    1985      1987    1986    1985
1.     PBO     ABO     PBO       BVPL    FASL    BVPL      BVPL    FASL    BVPL
2.     ABO     VBO     ABO       FASL    BVPL    FASL      FASL    BVPL    FASL
3.     VBO     PBO     VBO       VBO*    VBO*    VBO*      VBO*    VBO*    VBO*
4.     BVPL*   FASL*   BVPL*     ABO*    ABO*    ABO*      ABO*    ABO*    ABO*
5.     FASL*   BVPL*   FASL*     PBO*    PBO*    PBO*      PBO*    PBO*    PBO*


PBO= Projected benefit obligation. 
ABO= Accumulated benefit obligation. 
VBO=Vested benefit obligation. 
BVPL= Pension liability currently recognized. 
FASL= Pension liability to be recognized under SFAS 87. 
PLNA = Fair value of pension assets. 
BVPA= Pension asset currently recognized. 
FASA = Pension asset to be recognized under SFAS 87. 
* Possibly implausible counterpart. 
SYNOPSIS: Decision makers refer to their long-term memory to test the
implications of evidence about a current problem (Birnberg and Shields 
1984; Libby 1989). Given this reliance on long-term memory, biases in 
retrieval of previously encountered information may be an important source 
of decision error (Libby 1989), and differences in such biases may be one 
explanation for differences in auditor judgment performance across
experience levels.

The present study adopts a schema-based framework to examine 
some differences in the knowledge structures and judgments of experi- 
enced and inexperienced auditors and the relationship between these 
knowledge structures and judgments. The study examines the recall of 
typical and atypical information by experienced and inexperienced auditors 
within the context of a going-concern situation and then relates this
measure of memory to the inferences and predictive judgments made by
these auditors.

Three experiments were conducted. In experiment 1, auditors read a 
description of a company that the audit partner-in-charge had suggested 
may have a going-concern problem. The description consisted of items that 
are considered typical of a company with going-concern problems, atypical 
items, and filler items. After an intervening period with a distractor task, all
subjects were given a recall test, were asked to infer the likelihood of cer- 
tain previously unstated items being true, and to estimate the probability 
that the firm would fail within a year. 

The first of six main findings showed that experienced auditors re- 
called more atypical items than inexperienced auditors, but there were no 
differences in the number of typical items recalled. Second, experienced 
auditors recalled more atypical than typical items, whereas inexperienced 
auditors did not. Third, experienced auditors were more likely than in- 
experienced auditors to infer that previously unstated atypical items were 
true. Fourth, for both experienced and inexperienced auditors, the ratio of 
atypical to typical items recalled was positively correlated with the infer- 
ences made, and the inferences were negatively correlated with the pre- 
dictive judgments. This last correlation was much higher for the 
experienced than for the inexperienced auditors. Fifth, there was no direct 
relationship between recall and predictive judgments. Sixth, clustering of 
recall on the basis of atypical/typical items was significantly higher for 
experienced than for inexperienced auditors and was significantly cor- 
related with inferences for experienced auditors only. 

In experiments 2 and 3, we collected additional data to examine some 
validity threats related to the first experiment. In experiment 2, we exam- 
ined the relationship between recall and judgments, using audit managers 
who had worked on at least one audit with going-concern as an issue. We 
found results similar to those of experiment 1. In experiment 3, experienced 
auditors performed the recall and predictive judgments without the 
intervening inferences task. This provided a more direct test of the 
relationship between recall and predictive judgments. Again, no relation- 
ship was found. 


Key Words: Audit knowledge structures, Experience, Judgment, Typical- 
ity effect. 


Data Availability: The data upon which this paper is based may be
obtained from the authors on request.


A review of the literature is outlined in the first section of this article. Testable
hypotheses are developed in section II. The research methods used and the results
for experiment 1 are contained in sections III and IV, respectively. Experiments
2 and 3 are outlined in section V. The last section provides a discussion of the
results.


I. Prior Literature 


The importance of a better understanding of what allows experienced auditors to 
perform tasks that inexperienced auditors cannot has previously been noted (Ashton et 
al. 1988; Frederick and Libby 1986; Libby 1985). In developing this understanding, re- 
searchers have suggested that there is a need to examine the manner in which auditors 
organize and access their knowledge (e.g., Birnberg and Shields 1984; Libby 1985, 1989;
Libby and Frederick 1990; Waller and Felix 1984a, 1984b; Weber 1980) and how knowl- 
edge structure differences are related to differences in auditor judgments (Ashton et al. 
1988; Libby 1989). 

Waller and Felix (1984b) describe the means by which auditors search for and as- 
similate information stored in memory using a set of generic cognitive structures called 
“schemata.”1 Schemata organize memory and play a fundamental role in all cognitive
activity (e.g., remembering, predicting, explaining, formulating an opinion regarding a 
client’s financial reports). Waller and Felix state that the evidence supports the view 
that “schemata truly are the building blocks of cognition. They are the fundamental ele- 
ments upon which all information processing depends” (Rumelhart 1980, 33). 

Certain aspects of schemata are particularly relevant to this study. First, schemata 
enable individuals to group classes of observations together and order them in particular
ways (Taylor and Crocker 1981). Second, this organization varies with experience.2
For example, a number of studies have shown that more experienced individuals clus- 
ter domain-specific knowledge in more meaningful ways than less experienced indi- 
viduals (Adelson 1981; McKeithen et al. 1981; Weber 1980). Third, schemata determine 
the information to be encoded and retrieved from memory (Taylor and Crocker 1981). 
Fourth, schemata influence subsequent recall of information and provide a basis for in- 
ferences and predictions (Taylor and Crocker 1981). Fifth, many psychological theories 
implicitly assume a relationship between items recalled and judgments made (Hastie 
and Park 1986; Srull 1988). 

Ashton et al. (1988, 108) state that the role of schemata is particularly relevant in 
auditing “because the schemata developed by auditors through experience and prior 
knowledge of client situations may affect the manner in which the auditor perceives the 
evaluation of assertions and the need to accumulate and interpret evidence about these 
assertions.” They further suggest that understanding the nature of audit expertise may 
depend on examining schematic structuring attributes, such as differences in com- 
plexity of knowledge structures and recall ability among auditors with varying levels of 
experience. 

One phenomenon examined in the psychological research on schemata is how the 
comprehender first encodes and later retrieves information that is typical of a certain 
schema versus information that is atypical.3 Graesser and Nakamura (1982) outline four


1 Schemata are generic knowledge structures that guide the comprehender’s interpretations, inferences, 
expectations, and attention. A schema (the singular form) is generic in that it is a summary of the components, 
attributes, and relationships that generally occur in specific exemplars (Graesser and Nakamura 1982). 

2 In much of the psychology literature and some accounting literature, the terms “experience” and 
“expertise” are used interchangeably. Although the two may be correlated in many situations, the two con- 
cepts are not necessarily equivalent. For example, an experienced auditor may not have domain-specific 
knowledge for a particular task. In our study, we did not measure expertise for our subjects and therefore refer 
to them only as “experienced” or “inexperienced” auditors. In our literature review and hypothesis develop- 
ment sections, we refer to “experts” and “novices” as labels for subjects in previous studies that had a test to 
distinguish “expertise.” For example, Fiske et al. (1983) used a political science test to divide their student sub- 
jects into experts and novices. For previous studies that did not test expertise, we use the terms “experienced” 
and “inexperienced” regardless of what terms those authors used. For example, Lurigio and Carroll (1985) split 
their probation officers on the basis of median experience and referred to them as experts and novices. 

3 In our experiment subjects rated various items according to how typical the item would be for a company 
that has a going-concern problem. We define a “typical” item as one likely to exist (occur) for a company with 
going-concern problems and an “atypical” item as one unlikely to exist (occur) for a company with going-con- 
cern problems. The six-point scale used for rating items ranged from “very atypical” to “very typical,” which is 
models predicting that typicality and memory are related. Although the explanations
for the findings vary across models, they all predict more accurate memory recall for 
information that is atypical rather than typical of the schema.4

This schema-based psychological research has concentrated on shared knowledge 
structures, that is, subjects generally had a well-developed schema (e.g., Fiske and 
Kinder 1981; Lurigio and Carroll 1985). Examples of such well-developed schemata in- 
clude “going to a restaurant,” “washing a car,” and “visiting a doctor.” However, 
many tasks in the complex environment faced by the auditor require considerable expe- 
rience before a well-developed schema is acquired. Lurigio and Carroll suggest that 
variations in experience lead to important differences in schema content and applica- 
tion. They state that experienced individuals have more detailed and complete sche- 
mata and more sophisticated ways of using their knowledge than inexperienced indi- 
viduals. 

In adopting this schema-based framework to the recall and judgments of auditors, 
we have two main objectives, the first of which is to examine some differences in 
knowledge structures by examining the information recall by experienced and inexpe- 
rienced auditors. One characteristic that may vary with the level of experience is how 
auditors handle information that varies in typicality of some particular diagnosis. For 
example, Mautz and Sharaf (1961, 95) argue that the ability to recognize and diagnose 
important or significantly inconsistent information is presumably a key element of 
audit expertise. More recent protocol research by Bouwman (1982, 1984) shows that ex- 
perienced financial analysts focus on potential contradictions in their findings, 
whereas inexperienced analysts ignore these. By examining the recall of typical and 
atypical items by experienced and inexperienced auditors, we provide evidence on dif- 
ferences in their knowledge structures. Unlike the psychological research, this study 
examines schemata developed through work experience in a complex environment, 
and these schemata are expected to affect important decisions that auditors make 
(Ashton et al. 1988). 

The second objective of this research is to relate knowledge structures to auditor 
judgments. The relationship explored here has retrieval bias as an intervening variable. 


the same scale used in studies by Graesser et al. (1979), Graesser et al. (1980), Schmidt and Sherman (1984), and
Smith and Graesser (1981). In a survey article of this literature, Graesser and Nakamura (1982) define a typical
item as information that is both relevant and consistent to a central organizing schema, whereas atypical
information is inconsistent or irrelevant to that central organizing schema. As all of our typical and atypical items
are relevant to a going-concern judgment, our typical/atypical dichotomy can be equated with consistent/
inconsistent. This also fits with the use of terms consistent/inconsistent by Fiske et al. (1983), who asked
subjects to rate each item according to “how likely or unlikely is it that the country would. . . .” and with the use
of the terms “congruent” and “incongruent” by Hastie (1980). The terms “typical,” “consistent,” and
“congruent” are used interchangeably in comparisons of various models that explain the relationship between
typicality and memory (e.g., Graesser and Nakamura 1982 and Srull et al. 1985).

4 The purpose here is not to examine the relative validity of the four models. Instead we examine the com- 
mon finding of the relationship between memory recall and typicality, how it varies with experience, and how 
these variations are related to subsequent judgments. It should be noted that the more accurate memory recall 
for atypical items found in studies by Smith and Graesser (1981), Graesser et al. (1978), and Schmidt and
Sherman (1984) results from lower intrusion rates (false recalls) for atypical, relative to typical, items. The
proportion of presented typical and atypical items recalled did not differ for the shorter retention interval. However,
in studies by Hastie and Kumar (1979), Srull (1981) and Srull et al. (1985), intrusion rates were very low and the
more accurate memory recall resulted from a higher proportion of atypical than typical items being recalled.
The common finding of more accurate memory recall is described as the “typicality effect.”
Given that the primary immediate goal of most experimental research in accounting is
to increase understanding of decision making (Libby 1989), it is necessary to relate any 
differences in retrieval, resulting from differences in knowledge structures, to the judg- 
ments made. In addition, understanding the effects of possible recall differences on 
subsequent judgments helps to provide suggestions for further improvement of deci- 
sions, such as development and validation of expert systems in auditing (Ashton et al. 
1988; Pei and Reneau 1990). 


II. Hypotheses Development 


Graesser and Nakamura (1982) suggest that the schema “copy-plus-tag” (SC+T)
model provides the best explanation for the typicality findings in the psychology liter- 
ature. According to this model, once incoming information is comprehended, a specific 
memory trace is constructed by copying a subset of the identified generic schema that
best fits the incoming information. The subset of the schema that is copied into the 
memory trace consists of stated typical and unstated typical (inferred) items. Atypical 
items are simply tagged along the memory trace and remain a separate unit within the 
memory trace. Graesser and Nakamura predict that recall accuracy is better for 
atypical items than typical items of a given script. The rationale for this prediction is 
that, since the memory trace contains a pointer to the stated and inferred typical items 
as a single unit, it would be difficult to discriminate between stated and unstated items 
in a later recall test. In contrast, stated atypical items are tagged as functionally sepa- 
rate units in the memory trace. They would thus be easier to discriminate from the in- 
ferred atypical items that are not tagged in the memory trace and are not reactivated at 
the memory test. Graesser and Nakamura thus predict that false recalls (intrusions) for 
typical items will be higher than for atypical items, resulting in a lower overall recall 
accuracy for typical items. 

An alternative model is the “attention elaboration model” (Hastie 1980). This model 
assumes that more cognitive resources are allocated to information that deviates in 
some way from the schema. Since these deviations are difficult to relate to the schema, 
additional allocation of cognitive resources results in an increase in attention, rehearsal 
depth or breadth of conceptual elaboration, or in the number of associations with other 
items. Allocating more resources to the deviations from the schema results in richer, 
more durable, and more retrievable memory traces. 

Hastie suggests that the probability of recall of an item is a function of the number 
of links (associative paths) it has to other items. Further, because information that is 
atypical of a prior expectancy is relatively more difficult to comprehend than typical in- 
formation, it will be retained in the working memory for a relatively longer period of 
time. During this time, the person is assumed to retrieve additional information from
long-term memory in an effort to more fully comprehend the atypical information. As 
more previously stored information is retrieved and makes contact with the atypical in- 
formation in the working memory, additional associative paths develop. In the course 
of this internal elaborative processing, atypical information becomes extensively linked 
to other pieces of information, which makes them more retrievable and easier to recall 
than typical items.5

5 Graesser and Nakamura (1982) acknowledge that they did not completely eliminate the attention-elaboration
explanation for the typicality effect, but they suggest that the evidence for this explanation is slim. Conversely,
Srull et al. (1985) find general support for the attention-elaboration explanation and suggest that it is
not clear how the schema copy-plus-tag model could account for some of their results.


Choo and Trotman—Knowledge Structure and Judgments 469 


The only study that has examined the recall of typical/atypical information across 
different levels of either expertise or experience is Fiske et al. (1983). In a political 
science setting, they compared how typical and atypical pieces of information were 
handled when knowledge was shared.6 They argued that novices have less capacity to
handle relevant information and suggested that, if novices are forced to focus on only a 
subset of relevant information, the least costly subset would be typical information 
since atypical information requires more elaboration to be integrated with existing 
knowledge. Fiske et al. found that experts tended to recall significantly more atypical 
than typical items and that novices recalled significantly more typical items. In addi- 
tion, they examined clustering of recall and found that experts’ inferences were cor- 
related with their clustering of recall. Fiske et al. suggested that these inferences are 
contingent upon the degree to which experts organize memory by a particular type of 
information. This correlation was not significant for novices in the study. 


Typicality-by-Experience Interaction 


Although going-concern decisions may not be as frequent as those of “washing a 
car,” “going to a party,” or “going to a restaurant,” it is expected that experienced 
auditors will have a reasonably well-developed schema for such situations. As noted in 
Libby (1985, 649), “auditors bring a wealth of task-related knowledge to the audit, 
acquired through years of training and experience.” In contrast, inexperienced audi- 
tors would be expected to have a less well-developed schema for such audit situations 
as a company with going-concern problems. Consequently, the effects of the typicality 
of information on recall performance of experienced and inexperienced auditors 
would be expected to differ. 

Although the atypical/typical distinction should become more pronounced with increased
experience, the schema copy-plus-tag model does not provide clear insight into
what these recall differences may be. Graesser and Nakamura (1982, 72) note that recall 
proportions are not likely to vary with typicality “in any simple elegant manner.” How- 
ever, with respect to the attention elaboration model, Alba and Hutchinson (1987) state 
that improved accuracy of recall for atypical over typical items should be particularly 
true for experts because they are more sensitive to incongruence. Fiske et al. (1983) 
argue that experts’ knowledge is more organized than novices’, with the result that ex- 
perts have greater capacity to handle relevant information. This allows experts to give 
more attention to atypical items that require additional processing to integrate with 
existing knowledge. Fiske et al. provided evidence consistent with these assertions; ex- 
perts recalled more atypical than typical items. 

The differences noted above have not been examined for auditors, but it is likely 
that inexperienced auditors have a less developed schema of a going-concern company 
and consequently are less knowledgeable about typical and atypical attributes of such a 
company. Previous studies (Bouwman 1984) indicate that inexperienced subjects may 
ignore contradictions in their findings and may not integrate all atypical items into 
their knowledge set. That is, they may not have the knowledge to integrate atypical 


6 Fiske et al. (1983) note that, although political experts and novices undoubtedly do differ in levels of 
knowledge content, some of their shared knowledge may be called consensual. By holding constant the consen- 
sual knowledge presented in the stimuli, they examined differences in expert-novice knowledge use. However, 
they do note that while you can attempt to formulate theories that emphasize either content or process knowledge,
experimentally the two cannot be isolated given only the behavioral data base available to them.

items, or may not have the additional processing capacity to do so. As a result, we offer
the following hypothesis. 


H1: Experienced auditors will recall more atypical than typical items, whereas 
there will be no difference in the number of typical and atypical items recalled 
by inexperienced auditors. 


Knowledge Structures and Inferences 


As schemata are used in drawing inferences when information is missing (Taylor 
and Crocker 1981), experienced and inexperienced auditors would be expected to make
different inferences if their schemata are at different stages of development or if they 
use them differently. 

Murphy and Wright (1984) asked four groups of individuals with differing levels of 
clinical experience to list the typical features of three categories of children’s psycho- 
logical disturbances. They found that category distinctiveness decreased as experience 
increased. The categories of experienced individuals contained many features shared 
by two or more categories, whereas the categories of inexperienced individuals con- 
tained virtually no overlapping attributes. Applying these results to the going-concern 
situation, we suggest that experienced auditors are aware that firms with going- 
concern problems have attributes of both viability and failure. As a result, such auditors 
may be less likely to reject unstated items indicative of viability. Less experienced audi- 
tors presented with the partner’s suggestion of a going-concern problem are more likely 
to reject unstated viable items because they have more discrete (typical) mental representations
(i.e., more black and white distinctions). Typical items may be incorporated
into their knowledge base easily and, therefore, rated as true. However, this would not 
be the case for atypical items, as these cannot be easily incorporated into their knowl- 
edge base. In addition, support for hypothesis H1 would suggest additional availability 
of atypical items for experienced auditors, which is expected to affect their inferences 
of the likelihood of unstated atypical items being true. 

On the basis of the above, the following hypothesis is suggested: 


H2: Compared with inexperienced auditors, experienced auditors are less likely to 
infer previously unstated atypical items as being false for this client. For pre- 
viously unstated typical items, there will be no difference in the inferences of 
inexperienced and experienced auditors. 


Knowledge Structures and Predictive Judgments 


The availability model is one of the models that has been offered to explain a relationship
between memory and judgment.7 This model suggests that biased recall results
in biased input to judgment. For example, the recall of more instances of a certain event 


7 Hastie and Park (1986) outline four other theoretical processing models that attempt to explain the relationship
between memory and judgment. They distinguish between memory-based judgments and on-line
judgments. For memory-based judgments, they suggest a direct relationship between memory for the evidence 
and the judgment made. However, for on-line judgments (where the subject is forming the judgment “on-line” 
as evidence is encountered), they suggest it is not possible to unequivocally predict the relationship that will be 
obtained between memory and judgment measures. Our subjects were not asked to make a judgment at the 
stage when the items were presented. However, given the salience of the context, it is possible that one was 
made. In this case, availability could work much the same way if the judgment is used as a retrieval cue.

would result in a higher judged probability of occurrence of the event. Similarly, the recall
of more arguments in favor of a position would result in a higher evaluation of that
position (Hastie and Park 1986; Hoch 1984). With respect to a client with a possible 
going-concern problem, we would expect that the more those items typical of failure 
are recalled, relative to atypical items, the higher the probability estimate of company 
failure. Accordingly, the following relationship is tested: 


H3: There will be a significant relationship between auditors’ recall and predictive 
judgments. 


III. Research Methods
Overview of Experiment 1 


Experienced and inexperienced auditors read a description of the XYZ Company, 
which the audit partner-in-charge had suggested may have a going-concern problem (to
invoke the schema of a company with going-concern problems).8 The description consisted
of typical items, atypical items, and filler items. After an intervening period with
a distractor task, all subjects were given a recall test9 and then made inferences of the
likelihood that certain previously unstated items were true. Finally, subjects estimated 
the probability of company failure. Recall of typical and atypical items was then related 
to the inferences made and to the probability estimates. 


Subjects 


Subjects at two levels of experience were recruited from the same international 
firm. The experienced auditors were at the senior/supervisor level and had at least 
three years’ auditing experience (average experience=4.4 years). The inexperienced 
auditors were recruits, and each had less than six months’ practical experience (aver- 
age experience=3.3 months). The subjects completed the tasks during a staff training 
session. 

The experienced auditors were selected on the basis of discussions with partners 
who indicated that these subjects would have sufficient experience to perform the task. 
Although it is the partners who are responsible for an ultimate going-concern decision, 
seniors/supervisors are heavily involved in assessments of the likelihood of company 
failure as part of their analytical review work. In addition, in Australia, there is the re- 
quirement to audit the Directors’ Statement, which includes a statement that there are 
reasonable grounds to believe that the company will be able to pay its debts as and 


8 The going-concern task is chosen for this study because, first, it is an unstructured decision task. Abdolmohammadi
and Wright (1987) suggested that the differences between experienced and inexperienced
auditors could be more appropriately detected for an unstructured task than for a structured task. Second, past 
research and official pronouncements have provided ample examples of typical and atypical information with 
respect to a going-concern problem (AICPA 1981; Campisi and Trotman 1985; Kida 1984; Levitan and Knoblett 
1985; Mutchler 1984; Peel and Peel 1987). 

9 Many of the psychology studies that consider the typicality effect conduct both recognition and recall
tests. The argument for the application of recall and/or recognition tests in research of this nature is still very 
much a contentious issue (see, e.g., Singh et al. 1988). In practice, when an auditor is faced with new evidence, 
he or she will need to recall previously examined information to evaluate this subsequent evidence (Moeckel 
and Plumlee 1989). This procedure will be more similar to a recall test than a recognition test. In addition, a 
recall test allows us to assess memory organization, which a recognition test does not. Consequently, the present
study is restricted to a recall experiment.

when they fall due. Thus, on every audit, specific consideration has to be given to the
likelihood of company failure, and it is the task of the seniors/supervisors to raise any 
problems for discussion with the relevant managers and partners. 


Development of Research Instrument 


The measurement of typicality used in the study was taken from data collected pre- 
viously by Trotman and Zimmer (1983). Eight audit managers from international firms, 
all with at least seven years’ experience, rated 44 attributes according to how typical 
each would be for a company with a going-concern problem. A six-point scale ranging 
from “very atypical” to “very typical” was used, the same as the one used by Graesser 
et al. (1980) and Smith and Graesser (1981). The mean correlation index of 0.67 between 
the ratings of the eight managers reflects a reasonable level of agreement among their 
ratings. 

The mean score for each of the items was used to determine whether the item was 
typical. The ten items with the lowest mean scores (between 1.63 and 2.13) were selected 
to represent atypical items in this study. The ten items with the highest mean scores 
(between 4.75 and 5.63) were selected to represent typical items. Ten other items with 
scores between 2.63 and 4.38, judged by subjects to be neither very typical nor very 
atypical, were selected as filler items.10 A listing of these attributes is given in exhibit 1.
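The grouping can be checked by recomputing the group means from the typicality scores listed in exhibit 1. The sketch below is illustrative only: the names `TYPICAL`, `ATYPICAL`, `FILLER`, and `group_means` are assumptions, and the scores are transcribed from the exhibit, not taken from the original rating data.

```python
# Typicality scores transcribed from exhibit 1 (six-point scale,
# mean rating of eight audit managers per item).
TYPICAL = [5.625, 5.500, 5.250, 5.250, 5.000, 4.875, 4.875, 4.875, 4.750, 4.750]
ATYPICAL = [1.625, 1.750, 1.875, 1.875, 1.875, 2.000, 2.000, 2.000, 2.125, 2.125]
FILLER = [2.625, 2.750, 2.875, 3.000, 3.000, 3.375, 3.500, 4.125, 4.250, 4.375]


def mean(scores):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(scores) / len(scores)


def group_means():
    """Mean typicality score for each of the three item groups."""
    return {
        "typical": mean(TYPICAL),
        "filler": mean(FILLER),
        "atypical": mean(ATYPICAL),
    }


if __name__ == "__main__":
    for group, value in group_means().items():
        print(f"{group}: {value:.3f}")
```

With this transcription the three score ranges do not overlap, and the group means are approximately 5.08 (typical), 3.39 (filler), and 1.93 (atypical).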

In accordance with prior psychology studies, two versions of the going-concern 
script were prepared.11 Each version contained a total of 20 statements. Five of the ten
typical items and five of the ten atypical items were presented in version A. The re- 
maining five typical and five atypical items were presented in version B, and the ten 
common filler items appeared in both versions. Within each version, the ten test state- 
ments and ten filler statements were arranged in an order that distributed the typical 
and atypical statements evenly and that also represented a coherent sequence (Schmidt 
and Sherman 1984). Both versions A and B start and end with filler items in order to 
reduce any primacy and/or recency effect on subjects’ scores. Subsequent analysis 
revealed no experimental version (A or B) effect.


Task and Procedures 


The experiment consisted of five tasks. In part one, subjects were assigned at ran- 
dom to read either version A or version B of the description of XYZ Company. The in- 
troduction to both versions stated that the partner-in-charge had suggested that the 
company may have a going-concern problem. Subjects were asked to read carefully be- 
cause they would later be asked some questions about the firm. No mention was made 
of the subsequent recall test. After reading the company description for five minutes, 
subjects were given a 20-minute filler (distractor) task. The filler task was an unrelated 
audit judgment task similar to that in Ashton (1974). The third task was a recall test to 


10 The mean perceived typicality of the typical, filler, and atypical items is 5.08, 3.39, and 1.93, respectively.
These differences are significant (typical vs. atypical items, t=21.33, p<0.001; typical vs. filler items, t=5.70, 
p<0.001; atypical vs. filler items, t=9.00, p<0.001). 

11 The rationale for having two versions of each script is that this design feature “... permits an assessment
of sophisticated guessing for each test action (statement). In a recall task, a subject may recall a test action 
(statement) that was not presented. .. . An estimate of guessing is essential for an assessment of what is remem- 
bered about a passage. ... These design and counterbalancing constraints have been imposed in all the studies 
conducted in our laboratory” (Graesser and Nakamura 1982, 69). As there were very few intrusions in this 
study, the question of sophisticated guessing did not become an issue. 




Exhibit 1
Lists of Typical, Atypical, and Filler Items

(i) A list of the ten typical items selected for this study.

1. The market value of a number of the company’s properties acquired in 1984 is now markedly
   below their book value. (5.625)
2. Most of the company’s growth has been debt-financed, resulting in high leverage. (5.500)
3. Sales during the last five years have fallen substantially. (5.250)
4. Inventories are turning over much more slowly than in the past. (5.250)
5. Executive turnover is high. (5.000)
6. Last year’s trading conditions were affected by a downturn in the building industry. (4.875)
7. Additional borrowing is severely restricted by existing borrowing restrictions. (4.875)
8. Profits have declined over the past five years. (4.875)
9. Cash budgets indicate the company will not exceed its overdraft limit, but some assumptions,
   particularly the sales forecast, appear tenuous. (4.750)
10. The windows division has been selling some stock below normal margins in order to increase
    cash flow. (4.750)

(ii) A list of the ten atypical items selected for this study.

1. All preference dividends have been paid on time. (1.625)
2. Dividends have been paid to shareholders continuously since 1971. (1.750)
3. The company has been able to take advantage of suppliers’ discounts. (1.875)
4. Profits and cash dividends from an associated company have increased over the last three years.
   (1.875)
5. The company has a good credit rating with a well-known rating bureau. (1.875)
6. Dividends per share have been steady since 1981. (2.000)
7. The accounts receivable turnover rate is good compared to industry average. (2.000)
8. The only past audit qualification has been due to failure to depreciate buildings. (2.000)
9. The company has a modern plant that will not need to be replaced in the near future. (2.125)
10. There have been no past operating losses over the past five years. (2.125)

(iii) A list of the ten filler (neutral) items selected for this study.

1. Competition has been reduced due to an increase of tariffs on imports. (2.625)
2. There have been no industrial strikes within the company during the last 12 months. (2.750)
3. The company has made no changes in accounting techniques in the last two years. (2.875)
4. The company was incorporated in 1967 to manufacture and supply building materials. (3.000)
5. Total assets exceed total liabilities. (3.000)
6. The company founder and managing director died in 1982. (3.375)
7. A recent land acquisition was financed by the sale of listed share investments, which had been
   held for 18 years. (3.500)
8. The new general manager was experienced in the motor vehicle industry; he had initially no
   experience in the building materials industry. (4.125)
9. The company acquired a number of properties in 1984. (4.250)
10. There was a technical default on a loan agreement from a trading bank due to the current ratio
    limit being exceeded. (4.375)

Note: Numbers in parentheses are the typicality scores.


be completed in ten minutes. Subjects were instructed to list all the information that 
they could remember about the XYZ Company. They were told that even if they had 
doubts about the completeness or importance of their statements, they should include 
them in the list. 

Part four consisted of one of two versions of a likelihood inference task. Those who 
had received version A (version B) of the going-concern script earlier were given a list 
of the five typical and five atypical items from version B (version A). Subjects were 
given the following instructions: 


Listed below are ten pieces of information that were not included in the description of
XYZ given to you. Based on your knowledge of the company, please INFER how likely it
is that each of the ten pieces of information are true by circling the appropriate number
on the nine-point scale where 1 represents “extremely unlikely” and 9 represents “extremely
likely.”


Table 1
Summary of Experimental Tasks and Procedures
(Experiment 1)

                                    Version A Subjects           Version B Subjects
Part One                            Read version A of the        Read version B of the
(Time: 5 mins.)                     going-concern script         going-concern script
Part Two                            Perform internal control     Perform internal control
(Time: 20 mins.)                    (filler) task                (filler) task
Part Three                          Perform recall task          Perform recall task
(Time: 10 mins.)
Part Four                           Perform inference task       Perform inference task
                                    on version B items           on version A items
Part Five                           Prediction of failure        Prediction of failure
(Time: 5 mins. for Parts
Four and Five)
Number of experienced auditors      11                           10
Number of inexperienced auditors    11                           10


In part five, subjects were asked to estimate the probability that XYZ Company 
would fail within one year. They used an 11-point scale anchored on 0, “no chance of 
failure,” and 1, “certainty of failure.” 

Table 1 summarizes the order and timing of the particular tasks. Each part of the in- 
strument was collected from all subjects before the next part was started. 


Dependent Variables 
The dependent variable for hypothesis H1 was the recall proportions12 obtained

12 Studies by Graesser and colleagues also included an “intrusion proportion” (the likelihood of recalling a
typical [atypical] test statement from the alternate version of the script) and a derived memory score, a measure
of memory with scores expressing recall proportion corrected for intrusion errors. It was initially thought that

from part three of the experiment. There were five typical (atypical) test statements in
each version of the script; if a subject correctly recalled three of them, then his or her 
recall proportion for typical (atypical) statements is 0.6. Following the approach in the 
psychology literature, we scored the recalls by “a lenient-gist criterion” (e.g., Fiske et 
al. 1983; Graesser et al. 1980). With this criterion, recall of a given item does not need
to be verbatim; it is sufficient that the meaning is equivalent to that of the original text. 
One of the researchers and a research assistant with eight years of auditing experience 
independently scored each subject’s recall proportions. The correlation between the 
two raters was 0.96, and the Kappa Coefficient (Cohen 1960) indicated a non-chance
agreement of 0.99 (p<0.001). 
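The recall proportion and the chance-corrected agreement statistic can be sketched as follows. This is a minimal illustration, not the authors' scoring procedure: the function names are hypothetical, and the text does not say at what level the two raters' codes were compared, so the per-item "recalled / not recalled" coding used here is an assumption.

```python
from collections import Counter


def recall_proportion(n_recalled, n_presented=5):
    """Share of presented test statements that were correctly recalled."""
    if n_presented <= 0:
        raise ValueError("n_presented must be positive")
    return n_recalled / n_presented


def cohens_kappa(rater_a, rater_b):
    """Cohen's (1960) kappa for two raters coding the same items.

    Each argument is a sequence of category labels, one per item
    (assumed here: 1 = item scored as recalled, 0 = not recalled).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items coded identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    p_chance = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used a single, identical category
    return (p_observed - p_chance) / (1.0 - p_chance)
```

For example, a subject who correctly recalls three of the five typical statements has `recall_proportion(3)` equal to 0.6, as in the text.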

The dependent variable for hypothesis H2 was the subjects’ mean inference scores 
on both typical and atypical items from part four of the experiment. To test hypothesis 
H3, the ratio of recall proportion for atypical to typical items was correlated with the 
subject’s prediction of the likelihood of company failure (part five). 
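The H3 measure can be illustrated with a short sketch that forms the atypical-to-typical recall ratio for each subject and correlates it with the failure estimates. The function names are hypothetical, a plain Pearson correlation is assumed, and the source does not say how a subject with a typical recall proportion of zero was treated, so returning `None` in that case is an assumption.

```python
import math


def recall_ratio(atypical_proportion, typical_proportion):
    """Ratio of atypical to typical recall proportion for one subject."""
    if typical_proportion == 0:
        return None  # ratio undefined; handling is an assumption
    return atypical_proportion / typical_proportion


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return s_xy / math.sqrt(s_xx * s_yy)
```

In use, the list of ratios (one per subject) would be passed as `xs` and the corresponding probability-of-failure estimates from part five as `ys`.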


IV. Results 


First, a 2x2 ANOVA was used13 to test hypothesis H1. Experience was a between-subjects
factor with two levels (experienced and inexperienced), and typicality was a
within-subjects factor with two levels (typical items and atypical items). The dependent 
variable was recall proportion. 

The results of this analysis are shown in table 2. The significant experience-by-typicality
interaction effect (F=23.93, p<0.001) is consistent with hypothesis H1. A
diagrammatic representation of the interaction effect is shown in figure 1. 

Table 3 provides multiple comparisons of the simple main effects to identify the 
sources of differences. An a posteriori Tukey’s HSD test (Hays 1981) was used. The 
main results, shown in table 3, are that (1) although experienced auditors recalled a 
significantly greater number of atypical items than inexperienced auditors (0.419, 
p<0.05), they did not recall a significantly greater number of typical items (0.067, ns), 
and (2) although experienced auditors recalled more atypical than typical items (0.219,
p<0.10), inexperienced auditors did not (0.133, ns).14





these would be important; however, they were negligible in this study and therefore no further analysis was 
made on these measures. It is also noted that low intrusion rates are not unusual in other areas of psychological 
research (e.g., Shedler and Manis 1986). 

13 Given the three dependent variables, an overall MANOVA was calculated with recall proportion, inferences,
and prediction of failure as the dependent variables and experience as the independent variable. Experience
could take two levels: experienced auditors vs. inexperienced auditors. Recall proportion and inferences
in this case had to be calculated for each subject on both typical and atypical items together. The overall
MANOVA was significant (Wilks’ Lambda = 0.656, df=2, approximate F=6.63, p<0.001). Univariate F-tests
were subsequently used to test each of the three hypotheses. There are a number of alternative ways to conduct 
follow-up analyses. One problem of the univariate F-tests is that the experimental error rate may not be con- 
trolled. Bray and Maxwell (1985) suggest that the experimentwise alpha can be controlled by using a Bonfer- 
roni procedure. This adjustment was used with no changes in any of our conclusions. 

14 We had previously collected another set of data using parts one, two, and three of the research instruments
in this study. The subjects were 21 inexperienced and 20 experienced auditors from two international
firms (not the one used in the present study). The results for hypothesis H1 were consistent with figure 1. The 
only minor difference was that the difference between atypical and typical items for experienced auditors was 
slightly larger, significant at 0.05 instead of 0.10.

Table 2
A 2x2 ANOVA on Recall Proportions by Experience and Typicality
(Experiment 1)

                              Sum of           Mean                Significance
Source of Variation           Squares    df    Squares    F        of F
Between Subjects
  Experience                  1.24        1    1.24       16.40    0.000
  Error                       3.02       40    0.08
Within Subjects
  Typicality                  0.14        1    0.14        5.64    0.021
  Experience by typicality    0.65        1    0.65       23.93    0.000
  Error                       1.09       40    0.03
Total                         6.14       83


Recall proportion is measured as the ratio of the proportion of atypical to typical items recalled.
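The F statistics in table 2 follow from dividing each effect's mean square by the mean square of the matching error term. The sketch below recomputes two of them from the rounded sums of squares; the function name `f_ratio` is an assumption, and because the table reports only two decimals, the recomputed values match the published F values only approximately.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    if df_effect <= 0 or df_error <= 0 or ss_error <= 0:
        raise ValueError("degrees of freedom and error SS must be positive")
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


if __name__ == "__main__":
    # Between-subjects effect of experience (error: between-subjects).
    print(f"experience: {f_ratio(1.24, 1, 3.02, 40):.2f}")
    # Within-subjects interaction (error: within-subjects).
    print(f"experience by typicality: {f_ratio(0.65, 1, 1.09, 40):.2f}")
```

From the rounded entries, the experience effect comes out near 16.4 and the interaction near 23.9, close to the reported 16.40 and 23.93.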





Table 3
Multiple Comparisons of Simple Main Effects on Recall Proportions
(Experiment 1)

Tukey’s HSD test: critical value = $q_{0.05}\sqrt{MS_{\text{error}}/n} = 0.234$

                                                        Computed
Comparisons of Mean Recall Proportions                  Value      p-value
Experienced vs. inexperienced on typical items:         0.067      ns
Experienced vs. inexperienced on atypical items:        0.419      <0.05
Typical vs. atypical items by experienced:              0.219      <0.10
Typical vs. atypical items by inexperienced:            0.133      ns
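The critical value for Tukey's HSD test has the general form q × sqrt(MS_error / n). The sketch below is one reading that reproduces the reported critical values; it is not confirmed by the source. It assumes a studentized range value of q = 3.79 (the standard tabled value at alpha = 0.05 for four means and 40 error degrees of freedom), n = 21 subjects per cell, and the between-subjects mean square errors from the two ANOVA tables (0.08 for recall, 1.08 for inferences). The function name is hypothetical.

```python
import math


def tukey_hsd_critical(q, ms_error, n):
    """Smallest difference between two means that is significant under HSD.

    q        -- studentized range statistic for the chosen alpha,
                number of means, and error degrees of freedom
    ms_error -- mean square error from the ANOVA
    n        -- number of observations per mean
    """
    if ms_error < 0 or n <= 0:
        raise ValueError("ms_error must be non-negative and n positive")
    return q * math.sqrt(ms_error / n)


if __name__ == "__main__":
    Q_ASSUMED = 3.79  # assumption: tabled q(0.05; 4 means, 40 df)
    N_ASSUMED = 21    # assumption: subjects per cell
    print(f"recall:     {tukey_hsd_critical(Q_ASSUMED, 0.08, N_ASSUMED):.3f}")
    print(f"inferences: {tukey_hsd_critical(Q_ASSUMED, 1.08, N_ASSUMED):.3f}")
```

Under these assumptions the recall critical value is about 0.234 and the inference critical value about 0.859. A computed difference is then judged significant at the 0.05 level when it exceeds the critical value.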


The results of testing hypothesis H2 with a 2x2 ANOVA on the inferences by experience
and typicality are shown in table 4. The significant experience-by-typicality interaction
effect on subjects’ inferences (F=9.78, p=0.003) is consistent15 with hypothesis
H2. A diagrammatic representation of the interaction effects is shown in figure 2.

Given the significant interaction effect, multiple comparisons of the simple main 
effects were again calculated to identify the source of differences. Table 5 provides the 
results of the comparisons based on Tukey’s HSD test. The main results are (1) that both 
experienced (1.247, p<0.05) and inexperienced (2.457, p<0.05) auditors inferred that
previously unstated typical items were more likely to be true than previously unstated 


15 The assumptions of the model were not violated. In addition, the ANOVA was rerun with p (recall) as a
covariate (the covariate was not significant, p=0.28). Although the main effects for experience and typicality
remained significant (p<0.05), the interaction effect became marginally significant (F=3.32, p=0.076).

Figure 1
A Diagrammatic Representation of the Experience-by-Typicality
Interaction Effect on Recall Proportion

[Figure not reproduced. Vertical axis: mean recall proportion (s.d.), 0.0 to 1.0; horizontal axis: typical and atypical items.]

Table 4
A 2x2 ANOVA on Inferences by Experience and Typicality
(Experiment 1)

                              Sum of            Mean                Significance
Source of Variation           Squares     df    Squares    F        of F
Between Subjects
  Experience                   10.57       1    10.57       9.75    0.003
  Error                        43.37      40     1.08
Within Subjects
  Typicality                   72.06       1    72.06      91.70    0.000
  Experience by typicality      7.68       1     7.68       9.78    0.003
  Error                        31.40      40     0.79
Total                         165.08      83


atypical items,16 (2) that in comparison to inexperienced auditors, experienced auditors
inferred that previously unstated atypical items were more likely to be true (1.315, 
p<0.05), and (3) that for previously unstated typical items, there were no differences be- 
tween the inferences of experienced and inexperienced auditors (0.105, ns). 

In the development of hypotheses H1 and H2, it was argued that experienced audi- 
tors have additional processing capacity because their knowledge is more tightly orga- 
nized. One measure of knowledge organization is the level of clustering that can be 
assessed for each subject by calculating an “adjusted ratio of clustering” or ARC index 


16 Fiske et al. (1983) found that their experts made inferences slightly in reverse of the knowledge set given.
They stated that this result was a surprise and interpreted it to mean that their experts took typical items for
granted. It is suggested that our experienced subjects would not take these items for granted because of the
audit risk of ignoring items indicating failure.

Figure 2
A Diagrammatic Representation of the Experience-by-Typicality
Interaction Effect on Inferences

[Figure not reproduced. Vertical axis: mean inference (s.d.), 0.0 to 10.0; horizontal axis: typical and atypical items.]

Table 5
Multiple Comparisons of Simple Main Effects on Inferences
(Experiment 1)

Tukey’s HSD test: critical value = $q_{0.05}\sqrt{MS_{\text{error}}/n} = 0.8591$

                                                        Computed
Comparisons of Mean Inference                           Value      p-value
Experienced vs. inexperienced on typical items:         0.105      ns
Experienced vs. inexperienced on atypical items:        1.315      <0.05
Typical vs. atypical items by experienced:              1.247      <0.05
Typical vs. atypical items by inexperienced:            2.457      <0.05

(Libby 1985; Roenker et al. 1971; Weber 1980).17 In our study, the ARC score measures a
subject’s tendency to list typical items together and atypical items together in the recall 
protocol. The ARC score varies between —1 and +1, where an ARC of 0 indicates 
chance clustering, a value of +1 indicates perfect clustering, and a value of —1 
indicates no clustering. The mean ARC score for experienced auditors (0.557) was sig- 
nificantly higher (p<0.001) than for inexperienced auditors {—0.034).'* In addition, it 


17 Roenker et al. (1971, 48) suggest that this measure “provides an uncontaminated measure of the relative
amount of clustering in free recall, thereby allowing for comparisons between and within subjects and/or ex- 
periments.” The formula for calculating the index is given in Weber (1980). Weber states that the measure is in- 
variant to the number of categories the subject recalls, the distribution of total items recalled across categories, 
and the total number of items recalled. 

18 The ARC scores were recalculated with the filler items included and divided between failure and viability
items on the basis of the pretest. The mean ARC scores for experienced auditors (0.511) remained significantly
higher than for inexperienced auditors (-0.017) (t=4.12, p<0.01). In addition, it should be noted that
the other set of data referred to in footnote 14 had levels of clustering almost identical to those of our main
results (experienced = 0.543; inexperienced = -0.047).
Table 6
Correlation Matrix for Inexperienced Auditors
(Experiment 1)

                       Recall     Inferences    Predictive Judgment
Inferences             0.462*
Predictive judgment    0.047      -0.205
Clustering             0.219      0.181         0.275

* p<0.05.
Recall = ratio of recall proportion for atypical items to typical items.
Inferences = ratio of inferred likelihood of unstated atypical items to typical items.
Predictive judgment = probability of company failure.
Clustering = ARC score.


Table 7
Correlation Matrix for Experienced Auditors
(Experiment 1)

                       Recall     Inferences    Predictive Judgment
Inferences             0.465*
Predictive judgment    -0.455*    -0.751**
Clustering             0.111      0.527*        0.200

* p<0.05.
** p<0.001.
Recall = ratio of recall proportion for atypical items to typical items.
Inferences = ratio of inferred likelihood of unstated atypical items to typical items.
Predictive judgment = probability of company failure.
Clustering = ARC score.


was found that, for experienced auditors, the higher the level of clustering, the higher the
ratio of inferred atypical to typical items (r=0.527). This relationship was not significant
for inexperienced auditors (r=0.181).
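
As an illustration of the clustering measure discussed above, the following is a minimal sketch of an ARC calculation. It assumes the usual Roenker et al. (1971) form, ARC = (R - E(R)) / (max R - E(R)), where R is the number of adjacent recalled pairs from the same category, E(R) = (sum of squared category counts) / N - 1, and max R = N - k for N recalled items in k categories. The function name and the recall sequences are hypothetical and are not the study's data.

```python
from collections import Counter

def arc_score(recall_order):
    """Adjusted ratio of clustering for category labels listed in recall order.

    Returns None when the index is undefined (for example, when nothing is
    recalled or when all recalled items belong to a single category).
    """
    n_items = len(recall_order)
    if n_items == 0:
        return None
    counts = Counter(recall_order)
    n_categories = len(counts)
    # Observed repetitions: adjacent recalled items from the same category.
    observed = sum(1 for a, b in zip(recall_order, recall_order[1:]) if a == b)
    # Repetitions expected by chance, and the maximum possible.
    expected = sum(c * c for c in counts.values()) / n_items - 1
    maximum = n_items - n_categories
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)

# "T" = typical item, "A" = atypical item (hypothetical recall protocols).
print(arc_score(list("TTTAAA")))  # 1.0  (perfect clustering)
print(arc_score(list("TTATAA")))  # 0.0  (chance-level clustering)
print(arc_score(list("TATATA")))  # -1.0 (strict alternation)
```

Under these assumptions, a subject who lists all typical items together and all atypical items together scores +1, which matches the interpretation of the index given in the text.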

Hypothesis H3 stated that there would be a significant relationship between the 
auditors’ predictive judgments on the probability of company failure and the type of 
items recalled. Tables 6 and 7 provide the correlation matrices of the variables exam- 
ined. For experienced auditors, the predictive judgments were significantly correlated 
with recall (r = -0.455, p<0.05). That is, the greater the ratio of atypical to typical
items recalled, the lower the probability judgment. The same relationship was not significant for
the inexperienced auditors (r=0.047, ns). These results are consistent with hypothesis 
H3 for experienced auditors only. The above relationships are depicted in figure 3 and 
in tables 6 and 7. 

From figure 3, it is not clear whether there is a direct relationship between recall 
and the predictive judgments, or whether inference is an intervening variable in this 
relationship. With partial correlation analysis, the correlation for experienced auditors 
Figure 3
Correlation Between Recall, Inferences, and Predictive Judgments

[Figure: diagram of the correlations among recall, inferences, and predictive judgments. Values recoverable for the link between inferences and predictive judgments: -0.205, (-0.751)**, [-0.817]**.]

No enclosure = inexperienced.
( ) = experienced.
[ ] = managers.
* p<0.05.
** p<0.001.


between recall and predictions conditional on inferences is -0.181 (p=0.44), which
suggests that it is inferences that are associated with both recall and predictive
judgments.
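
The partial correlation reported above can be reproduced from the pairwise correlations in table 7. The sketch below assumes the standard first-order partial correlation formula; the text does not state which formula the authors used, and the function name is hypothetical.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Experienced auditors (table 7):
# x = recall, y = predictive judgment, z = inferences.
r = partial_corr(r_xy=-0.455, r_xz=0.465, r_yz=-0.751)
print(round(r, 3))  # -0.181
```

Most of the raw correlation of -0.455 between recall and predictive judgments is absorbed once inferences are held constant, which is the pattern the text describes.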

We collected additional data (reported below as additional experiments) to test a
number of alternative explanations for the lack of a direct relationship between
recall proportions and predictive judgments. First, as noted by Davis and Solomon
(1989, 161), “task-specific experience is more important than tenure as an accountant.”
Since we did not have specific information on each subject’s task-specific
experience, it is possible that lack of experience with the task could account for the 
non-significant association. This possibility is tested in experiment 2. 

Second, given the strength of the relationship between recall and both inferences 
and the predictive judgments, it may be that recall affected both inferences and predic- 
tive judgments in the same manner. That is, as both are concerned with prospects of 
failure and will be affected by the schema invoked and the encoding of the new infor- 
mation, recall may have had a similar effect on both. Therefore, by partialling out the 
correlation between recall and inferences, we may also be eliminating the correlation 
between recall and predictive judgments. 

Third, we suggest that the natural tendency of cognitive processes would be to 
recall, infer missing information, and then make a predictive judgment. Asking subjects 
in experiment 1 to write down their recall and inferences may have affected the natural
cognitive process that usually accompanies reading about a problem and making a predictive
judgment, and may have decreased the correlation between memory recall and predictive
judgments. These last two alternatives are examined in experiment 3.
Table 8
Correlation Matrix for Managers
(Experiment 2)

                       Recall     Inferences    Predictive Judgment
Inferences             0.613*
Predictive judgment    -0.510*    -0.817**
Clustering             0.443      0.478*        0.284

* p<0.05.
** p<0.001.
Recall = ratio of recall proportion for atypical items to typical items.
Inferences = ratio of inferred likelihood of unstated atypical items to typical items.
Predictive judgment = probability of company failure.
Clustering = ARC score.


V. Additional Experiments 
Experiment 2 


To examine the effect of task-specific experience on this result, we replicated the experiment
using 16 experienced managers who were selected on the basis of having worked on at 
least one audit with going-concern as an issue. These managers are responsible for 
determining the likelihood of their clients having solvency problems. Partners from 
two international firms (other than the firm used in experiment 1) were contacted to 
select subjects on the above basis. One firm provided nine managers, who completed 
the experiment at one of two sittings in one of the firm’s training rooms. The other firm 
provided seven managers, who completed the experiment on an individual basis in the 
presence of one of the researchers. The average experience level of the 16 managers 
was 8.7 years. The subjects were all given version B of the research instrument of
experiment 1, and the same procedures of that experiment were followed.¹⁹

The correlations for the managers are included in table 8 and figure 3. The recall
factor is significantly correlated with predictions (r = -0.510, p<0.05) and with
inferences (r = 0.513, p<0.05). Inferences and predictions are significantly correlated
(r = -0.817, p<0.001). The partial correlation between recall and predictions
conditional on inferences is -0.262 (p=0.33). These correlations are all in the same
direction and show the same basic results as those in experiment 1.


19 This data for the managers was collected more than three years after the original data was collected. Over
the three-year period, there were large changes in the audit methodologies of the firms concerned. More im- 
portantly, there was a significant decline in the Australian economy during this period, with many company 
failures, including some of the largest companies. The extensive press space given to these company failures 
included criticisms of the auditing profession. As a result, failure items receive more attention in the present 
economic climate, and this may suppress the effects already found. Therefore, there is a major validity threat in 
comparing these variables with those collected more than three years ago. However, the direct effect of these 
differences on the correlations is difficult to predict, and they are reported here to illustrate that there appear
to be no major differences in the relationship between recall and judgments between the seniors/supervisors
and the managers. 
Experiment 3


In experiment 3, we investigated the association between recall and predictive 
judgments without the intervening task of making explicit inferences about the client.²⁰
The research instrument and procedures were the same as in experiment 1 with the ex- 
ception that subjects were not given part four. In summary, subjects read the informa- 
tion, were given a distractor task, completed a recall test, and then made a predictive 
judgment. The experiment was conducted during a staff training program with 18 
supervisors from one of the firms used in experiment 2. They had an average of 4.76 
years of audit experience. Like the subjects in experiment 1, these supervisors are re- 
sponsible for raising any going-concern problems on every audit. 

The correlation between recall and predictive judgments was 0.14 (p=0.58). This 
result is consistent with the results obtained with partial correlations in the preceding 
two experiments, but is not consistent with any of the three alternative explanations for 
the lack of a significant association between recall and predictive judgments. The re- 
sults of the three experiments show that, in our study, there is no direct association 
between recall and predictive judgments. 


VI. Discussion 


With respect to knowledge structures, our study found differences between expe- 
rienced and inexperienced auditors in the amounts, type, and clustering of items 
recalled. We also found differences in the inferences made by experienced and inexpe- 
rienced auditors. For experienced auditors, these inferences were significantly corre- 
lated with predictive judgments and the clustering of recall. This latter relationship im- 
plies that experienced auditors’ inferences are contingent upon the degree to which 
they organize memory by a particular type of information. Finally, for both experienced 
and inexperienced auditors we found no direct relationship between the predictive 
judgments made and the items recalled. 

There are a number of aspects of our research design that should be noted by re- 
searchers who intend to further investigate the relationship between memory and audi- 
tor judgments. First, even though subjects were not asked to make a predictive judg- 
ment before making the recall, it is suggested that, given the nature of the context, 
subjects most likely formed their judgments “on-line” as evidence was encountered 
and before the recall exercise. Previous psychological and marketing research (Hastie 
and Park 1986; Lichtenstein and Srull 1985) has also shown that, for on-line judgments, 
there is often no direct relationship to memory. However, for memory-based judg- 
ments, the correlation between recall and judgments was consistently higher. Although 
many audit judgments will be made on-line in practice, some will be memory-based.²¹

Second, the subjects in this study recalled items that had been previously presented 
to them. Using self-generated reasons in an accounting context, Moser (1989) found a 
relationship between subjects’ reasons for and against thinking a company’s profit 


20 This data was collected approximately six months after the data in experiment 2. The state of the Australian
economy further weakened in this period, so the validity threat noted in footnote 19 also applies here.

21 With on-line judgments the subject is forming the judgment as evidence is encountered. For memory-based
judgments, the subject must rely on retrieval of evidence from long-term memory to make a judgment. An
example of a memory-based judgment in auditing would be a senior being asked to make a judgment in the review
process that he or she did not previously consider when collecting and evaluating evidence.
would increase by at least five percent and their subsequent judgment of the likelihood
of this event. The recall of previously accumulated knowledge compared to the recall of
previously presented information may have different relationships to subsequent
judgments.

Third, our study did not obtain any information on the perceived importance or 
diagnosticity of each of the items recalled. It may be that it is the perceived importance 
or diagnosticity of the atypical versus typical items recalled, rather than the number of 
items recalled, that is associated with predictive judgments. For example, a subject 
could recall four atypical and two typical items and still provide a high probability of 
failure if he or she considered one or two of the typical items to be highly diagnostic of 
failure. 

Fourth, this study did not examine certain aspects of judgments that may affect a
global judgment such as prediction of failure. For example, Hogarth (1987) states that 
memory affects judgments in several ways, including task structuring, selection of 
cues, choice of rules used to process information, and interpretation of outcomes. Re- 
search that examines the link between these items, memory, and the final judgment 
may also help to clarify the relationship between memory and judgments. 
of a Market? Some Answers
from the Laboratory 
SYNOPSIS: The nature of information regulation depends on the
informational efficiency of capital markets (see Beaver 1989, 152-71; Dyck- 
man and Morse 1986, 82-91). Consequently, researchers in accounting and 
finance have spent considerable effort attempting to measure efficiency. 
Although this investigation has spanned many research designs and has 
been applied to many different information signals, empirical tests all suffer 
from the same basic problem: the benchmark of interest, an informationally 
efficient market, is unobservable. The asset price that would have prevailed 
in an efficient market must therefore be modeled, and the test of market 
efficiency is confounded with a test of the asset-pricing model. Because of 
this ambiguity, whenever a researcher claims to find an abnormal return 
based on some information signal another researcher invariably responds 
that risk was not adequately controlled. For instance, Bernard and Thomas 
(1989, 1990) present evidence that markets do not adequately adjust to
quarterly earnings announcements (i.e., there is a significant
post-announcement drift), while Ball et al. (1990) argue that the market
adjustment may be correct if the level of risk during the announcement period is
adequately controlled for.
Unlike naturally occurring markets, the efficiency of a laboratory 
market can be measured directly by creating another “artificial” economy 
that is identical to the economy of interest, except that all information is 
fully disseminated. The price in the artificial economy is the efficient price 
by definition; it is determined endogenously and without reference to an 
asset-pricing model. Using this method of measuring a market's efficiency,
this study investigates how efficiency is influenced by different information 
or market structures. Although such an investigation will not resolve the 
issue of whether naturally occurring markets are efficient, laboratory results 
can identify features of a market or information structure that aid or impede
efficiency. 

The study compares two information structures that differ by whether 
there is aggregate certainty in the market; that is, whether the union of all
traders’ information signals perfectly identifies the value of the risky asset. 
Previous experimental research in market efficiency has used markets with 
aggregate certainty. However, many of the difficulties of decision making 
under uncertainty disappear when the information in the market collectively 
reveals the asset’s payoff. On the other hand, for the experiments con- 
ducted here, there are relatively more signals to aggregate in the markets 
with aggregate certainty. The results show that in markets where different 
traders have different information signals, the presence of aggregate 
uncertainty significantly reduces efficiency relative to similar markets with 
aggregate certainty. However, the results also show that markets are very
efficient when some traders have a common but imperfect information
signal and other traders are uninformed. In these markets there is
aggregate uncertainty but no diversity of information among informed 
traders. Thus, diversity of informed traders’ information and aggregate 
uncertainty together lead to inefficient markets, but neither treatment by 
itself causes inefficiency. 

The study also manipulates the number of traders in the market. It is 
sometimes argued that markets are efficient because there are a large 
number of traders whose individual errors average out. However, there is 
no reason to believe that the asset-pricing relation applies equal weight to 
each trader’s belief, so a central limit result may not hold. The results show 
that the number of traders has no significant impact on the efficiency of the 
final prices in a trading period. Within a trading period, however, markets 
with only a few traders converge to the efficient price much more quickly
than do markets with many traders. The results also show that there is a 
greater diversity of behavior in the markets with many traders. It is possible 
that this increased diversity increases the number of “noisy” trades,
making it more difficult to infer information from market data.

In any investigation of a market's efficiency, different traders must
have different information at the time efficiency is being assessed; 
otherwise the market is efficient by definition. Although accounting 
disclosures are publicly available, they can effectively generate different
information signals to different traders. The markets presented here give 
two examples. In the aggregate certainty treatment, some traders received 
good news signals and other traders received bad news signals. An 
example of this type of information system is an economy where different 
traders have different earnings expectation models. In such an economy
the same earnings report can be good news to some traders and bad news
to other traders. As long as the “correct” earnings expectation model is 
unknown, each trader would find the other traders’ signals (in this case,
their forecast errors) informative. In the number-of-traders treatment,
some traders receive a signal while other traders do not. An example of this 
type of information system is an economy where some traders receive 
accounting disclosures very quickly by subscribing to a news wire service while
other traders receive the information via third-class mail. Here the 
uninformed traders would benefit by learning the informed traders’ signal. 


Key Words: Market efficiency, Rational expectations, Laboratory markets, 
Experimental economics. 


Data Availability: The data used in this study may be obtained from the
author on request. 


The paper proceeds as follows. In the first section I discuss the definition of informational
efficiency and its relation to a fully revealing rational expectations equilibrium.
In section II, I develop two hypotheses predicting how the number of
traders and the nature of the uncertainty in the market will affect efficiency. I describe 
the design of the markets in section III and present the results in section IV. I conclude 
in section V with an attempt to generalize from the laboratory to naturally occurring 
markets. 


I. Definition of an Informationally Efficient Market 


Let the random variable $S_i$ represent trader $i$’s information system, where $S_i$ is
distributed jointly with the asset payoff $F$ (which may be a vector). Assume that $S_i$ and $F$
are not independent, so each $S_i$ is informative. Denote the market information system
for an economy with $n$ traders by the random vector $S = (S_1, S_2, \ldots, S_n)$. A realization of
$S_i$ is a trader’s signal; a realization of $S$ will be referred to as the market signal. This
gives the following definition of an informationally efficient market: 


A market is efficient with respect to the market information system if, for all market sig- 
nals, the price and asset allocation are identical to the price and allocation that would 
arise in an artificial economy with the same configuration of preferences and endow- 
ments but where each trader’s signal is the original market signal. 


This definition follows from Beaver (1981) who summarized it by stating “market 
efficiency with respect to an information item means that prices act as if everyone 
knows that information.”¹ Efficient prices are equivalent to the prices in an artificial
economy where all traders receive the market signal. Beaver’s original definition 
required only equivalent prices; I have appended equivalent allocations so that an
informationally efficient market is a market that achieves a fully revealing rational
expectations (RE) equilibrium.² This provides the link to previous experimental work
investigating RE equilibria.
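
The definition above can be read as a check performed signal by signal. The sketch below is only an illustration under stated assumptions: the function name, the tolerance, and all prices and allocations are hypothetical, and each economy is summarized by one price and one vector of risky-asset holdings per market signal.

```python
def is_informationally_efficient(actual, artificial, tol=1e-6):
    """Compare the actual economy with the artificial, fully informed economy.

    Both arguments map a market signal to (price, allocation), where
    allocation is a tuple of risky-asset holdings, one entry per trader.
    Efficiency requires a match on price and allocation for every signal.
    """
    for signal, (price, allocation) in actual.items():
        ref_price, ref_allocation = artificial[signal]
        if abs(price - ref_price) > tol:
            return False
        if len(allocation) != len(ref_allocation):
            return False
        if any(abs(a - b) > tol for a, b in zip(allocation, ref_allocation)):
            return False
    return True

# Hypothetical outcomes for two market signals and three traders.
artificial = {"good": (120.0, (2, 1, 1)), "bad": (80.0, (1, 1, 2))}
matches = {"good": (120.0, (2, 1, 1)), "bad": (80.0, (1, 1, 2))}
price_only = {"good": (120.0, (1, 2, 1)), "bad": (80.0, (1, 1, 2))}

print(is_informationally_efficient(matches, artificial))     # True
print(is_informationally_efficient(price_only, artificial))  # False
```

The second case has the efficient price for every signal but a different allocation, so it would satisfy a price-only definition while failing the definition with equivalent allocations appended.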

1 Beaver (1981) distinguishes between “signal efficiency,” which is only a statement about the efficiency of
the market with respect to a given market signal realization, and “system efficiency,” which is the definition 
given here. By pooling observations over all realizations of the particular market information system of in- 
terest, both the experimental and empirical literature have focused on system efficiency. 

2 Latham (1986) also shows that appending the equivalence of allocations to Beaver’s definition rules out
certain pathological cases in which a market can be efficient with respect to one information set (defined as a
partition) yet inefficient with respect to a coarser information set.
Predecessors to this definition are due to Fama (1970, 1976). Fama (1976, 136)
defines a market as informationally efficient if “the market is aware of all available in- 
formation and uses it correctly.” As Beaver (1981), Latham (1986), Dyckman and Morse 
(1986, 18-22), and Strong and Walker (1987, 129-33) have pointed out, this definition is 
not well specified. If investors have heterogeneous beliefs and preferences, then what 
does it mean for “the market” to be aware of information or for information to be used 
“correctly?” The operational tests of efficiency in the accounting capital markets litera- 
ture have typically judged a market as efficient with respect to some accounting infor- 
mation disclosure if the average returns from a trading strategy based on the disclosure 
are not significantly different from the returns generated by some version of the CAPM. 
For empirical studies using secondary data, it is necessary to model the efficient price 
with some asset-pricing model, so both Beaver’s and Fama’s definitions may lead to the 
same operational test. The advantage of Beaver’s definition is that it does not depend 
on any particular asset-pricing model; it separates the concept of market efficiency 
from the models and methods used to test the concept. Also, although laboratory mar- 
kets offer many advantages in assessing efficiency as defined by Beaver, with severe 
limits on the number of assets that can be traded, they offer no comparative advantage 
in assessing the CAPM. 

The term “efficiency” is frequently used in another context. In the financial eco- 
nomics literature, an allocation is Pareto efficient if there exist no other allocations that 
dominate the given allocation, where each trader’s welfare index is the expected utility 
of final wealth, given his or her beliefs (see Ohlson 1987, 49). A well-known example of 
a market that generates Pareto efficient allocations is one with a complete set of state- 
contingent claims (or Arrow-Debreu securities). Unfortunately, a market that is infor- 
mationally efficient need not generate Pareto efficient allocations. Informational effi- 
ciency means only that the equilibrium is as if all traders have the common market 
information system, but homogeneous information systems alone do not assure Pareto 
efficient allocations.³ In general, the equivalence of allocations between the actual and
artificial economies, as required by informational efficiency, does not imply that the 
allocation is Pareto efficient. However, in the particular markets I conduct, there are 
two assets (one risky and one riskless) and, given the market signal, there are no more 
than two payoff states that can occur. Thus, the artificial economy is a complete market 
and, in equilibrium, its resulting allocations will be Pareto efficient.⁴ Furthermore, if
the actual economy is informationally efficient then traders will have the same beliefs 
and allocations as in the artificial economy, so, for the markets I conduct,
informational efficiency implies Pareto efficiency.
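
The completeness argument above rests on two assets spanning two payoff states. A minimal sketch of that spanning condition follows; the payoff numbers and function names are hypothetical and are not taken from the experiments. With two assets and two states, the market is complete when the 2x2 payoff matrix is invertible, so any state-contingent claim can be replicated.

```python
def is_complete_two_state(riskless, risky, tol=1e-12):
    """Each argument is (payoff in state 1, payoff in state 2)."""
    det = riskless[0] * risky[1] - risky[0] * riskless[1]
    return abs(det) > tol

def replicate(riskless, risky, target):
    """Holdings (a, b) with a * riskless + b * risky == target in both states."""
    det = riskless[0] * risky[1] - risky[0] * riskless[1]
    if det == 0:
        raise ValueError("payoffs do not span both states")
    # Cramer's rule for the 2x2 system.
    a = (target[0] * risky[1] - risky[0] * target[1]) / det
    b = (riskless[0] * target[1] - target[0] * riskless[1]) / det
    return a, b

riskless = (1.0, 1.0)   # pays 1 in either state
risky = (10.0, 2.0)     # hypothetical high and low payoffs

print(is_complete_two_state(riskless, risky))   # True
print(replicate(riskless, risky, (1.0, 0.0)))   # (-0.25, 0.125)
```

The replicated claim pays 1 in the first state and 0 in the second, that is, a state-contingent (Arrow-Debreu) claim built from the two traded assets. If the risky asset paid the same amount in both states, the check would fail and the market would be incomplete.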


3 Amershi (1985) gives necessary and sufficient conditions for a market to achieve Pareto efficient allocations
when traders have homogeneous information systems. The conditions involve a combination of restrictions
on preferences and securities. For instance, competitive allocations are Pareto efficient if preferences
exhibit linear risk tolerance and there is a riskless asset in the market. 

4 The artificial economy is extremely simple relative to the experimental asset markets in most prior studies.
With homogeneous information systems and a complete market, the only motive for trade is to share risk
between traders with different risk preferences. Nonetheless, in the previous studies with aggregate certainty 
(e.g., Plott and Sunder 1988) the Pareto efficient allocation (in a revealing RE equilibrium) was completely 
determined by the exogenous dividend values for the risky asset; in a revealing equilibrium the state of nature 
was known and the Pareto efficient allocation simply allocated all of the risky asset to the trader with the 
highest dividend value in that state. In my study, the Pareto efficient allocation cannot be determined exoge- 
nously but must be measured by reference to the artificial economy. Thus, the allocation in the artificial econ- 

The definition of an informationally efficient market is a statement about the relation
between information signals available at a particular point in time and the resulting
price and allocation in the market. Thus, a market can be inefficient with respect to
the market signal at one point in time but efficient with respect to the same market sig- 
nal at some future point in time, possibly after sufficient trading has taken place or 
when the market is in equilibrium. Although not explicitly included in the definition, 
an efficient market is also generally considered to be one that converges to the efficient 
price and allocation quickly. I will measure how the different treatments affect the 
speed of convergence within a trading period and over the entire sequence of trading 
periods in the experiment. 

In the empirical literature, efficiency is frequently assessed with respect to three 
different information systems: all public and private information (resulting in
strong-form efficiency), all public information (resulting in semistrong-form efficiency), and
past prices (resulting in weak-form efficiency). However, if accounting disclosures gen- 
erate the same information system for all traders, then the market is efficient with 
respect to this system by definition; it only makes sense to assess the informational effi- 
ciency of markets in which traders have different information systems. When assessing 
efficiency with respect to accounting information, the researcher believes that, at the 
point in time under consideration, either different traders effectively received different 
signals or some traders received the information while others did not. The distinction 
between strong-form efficiency and semistrong-form efficiency appropriately differen- 
tiates between the sources of the raw information (and perhaps the legality of using it), 
but beyond that the distinction is more apparent than real. The previous empirical liter- 
ature has not described how accounting disclosures give rise to different information 
systems because these systems are largely unobservable. However, in a laboratory the 
experimenter explicitly specifies each trader’s information system. 


II. Hypotheses about Market Efficiency 


One way that markets might become efficient is if traders learn other traders’ signals
by observing market data (such as prices). This is the RE hypothesis. Inefficient
markets arise because there is noise in market data which prohibits traders from learn- 
ing the market signal. In this framework, hypotheses about what causes markets to be- 
come efficient describe how the noise in market data is a function of some attribute of 
the market or information structure. I develop two hypotheses, one related to the type 
of information in the economy and one related to the number of traders in the market. 

The first hypothesis is suggested by the results of previous research. Briefly, Plott 
and Sunder (1982) considered the case in which one group of traders was perfectly in- 
formed about the state outcome and another group was uninformed. They found that 
the markets converged to the RE prediction very quickly.⁵ Plott and Sunder (1988) considered
a more severe test of the RE prediction. In this study, two trader groups had dif-


5 The success of the RE model in markets with perfectly informed and uninformed traders is remarkably
robust. Plott and Sunder (1982) did the original study, which used double oral auctions and an equal number of 
informed and uninformed traders whose identity remained the same throughout the experiment. Banks (1985) 
replicated the result in markets where the informed traders varied randomly. DeJong et al. (1990) replicated the 
result in a computerized exchange, which eliminated all non-price information. 
ferent information signals that collectively identified the state outcome. In this case, an
efficient market must correctly aggregate the different signals. They found that com- 
plete markets and markets in which all traders had identical payoffs for the risky asset 
were efficient. However, markets were not efficient if they were incomplete, different 
traders had different payoff schedules for the risky asset, and the payoffs were not common
knowledge. Forsythe and Lundholm (1990) confirmed the inefficiency of incomplete
markets with diverse payoff schedules, even when subjects were experienced. They also
showed that making the different payoffs common knowledge was sufficient to reach 
an RE equilibrium with experienced subjects. Removing some of the uncertainty re- 
garding other traders’ incentives to trade reduced the noise in market data sufficiently 
to allow the detection of the market signal. 

There are two observations to be made from the previous experimental work. First, 
reducing the amount of uncertainty regarding other traders’ motives for trading in- 
creases the efficiency of the market. Second, the information structure of the previous 
experiments was one of aggregate certainty; the union of all traders’ signals (i.e., the 
market signal) perfectly identified the value of the asset. Drawing inferences from 
market data is considerably easier when there is aggregate certainty in the information 
structure. With aggregate certainty, there is no uncertainty in a fully revealing RE equi- 
librium, so the price doesn’t depend on traders’ risk preferences. If each trader’s payoff 
from the risky asset is common knowledge, then in a revealing equilibrium each trader 
knows all other traders’ limit prices for the asset and, consequently, their incentives to 
trade. In short, with aggregate certainty in the information structure, each trader has 
sufficient information to compute the fully revealing RE equilibrium ex ante. 

In contrast, with aggregate uncertainty, the union of all traders’ signals does not 
identify the risky asset’s payoff. In this case, the equilibrium price depends on the risk 
preferences of all the traders. Because preferences are generally unobservable, traders 
cannot unambiguously identify other traders’ incentives to take different actions even 
in a fully revealing RE equilibrium. Furthermore, traders may not be expected-utility 
maximizers. In this case, the possibilities for “unusual” behavior in the face of uncer- 
tainty are potentially greater in markets with aggregate uncertainty. In sum, previous 
experimental work has been conducted in markets with the unique feature of aggregate 
certainty. It is possible that markets will remain inefficient with aggregate uncertainty, 
either because traders will not be sufficiently informed regarding one another’s incen- 
tives to trade or because their behavior, given their information, will generally become 
less predictable. 

On the other hand, for my experimental design, the markets with aggregate cer- 
tainty have more signals in the information structure than the markets with aggregate 
uncertainty. Because there are more signals to aggregate in an efficient price, the re- 
quirements of an efficient market are more severe. Although the markets with aggre- 
gate certainty may have features that aid in learning other traders’ signals, there are 
more signals to learn. Which effect will dominate depends on the process by which 
traders learn the market signal. 

To examine how aggregate uncertainty affects market efficiency, I first conduct 
markets in which the collection of signals perfectly identifies the asset value (i.e., aggre- 
gate certainty). I then conduct markets that are identical to the previous ones except 
that one of the information signals is removed so that the asset value is no longer
perfectly identified (i.e., aggregate uncertainty).

The second hypothesis about market efficiency concerns the number of traders in
the market. Various authors have suggested that a large number of traders is a sufficient 
condition for efficient markets. For instance, Fama (1970, 388) states that “the market
may be efficient if ‘sufficient numbers’ of investors have ready access to available infor- 
mation.” The idea is that each trader’s idiosyncratic error, the result of imperfect or 
costly information processing, will average out as the number of traders increases 
(Bagehot 1971; Beaver 1989). However, this argument is incomplete. It explains why the 
average of traders’ forecasts is a superior forecast to any individual forecast, but it does 
not explain how the market price correctly weights each individual’s forecast. Al- 
though most asset-pricing models put more weight on the forecasts of traders with 
more precise information, they also put more weight on the forecasts of traders who are 
more risk tolerant, regardless of the precision of their information. Furthermore, each 
trader’s forecast manifests itself in price through the trader’s demand, which may also 
be influenced by the trader’s idiosyncratic behavior. Although adding another trader to 
the market adds another forecast to the price function, the trader’s idiosyncratic be- 
havior also adds noise. Which effect will dominate depends on the particular features 
of the “true” asset-pricing model operating in the economy.6
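
The averaging argument, and the reason it is incomplete, can be sketched in one line of algebra. The setup is an illustrative assumption, not the model in the appendix: each of N traders forecasts the asset value v with an independent, mean-zero error of common variance \sigma^2, and the price aggregates the forecasts with weights w_i that sum to one.

```latex
\[
f_i = v + \varepsilon_i, \qquad
\operatorname{Var}\Big(\frac{1}{N}\sum_{i=1}^{N} f_i\Big) = \frac{\sigma^2}{N}, \qquad
\operatorname{Var}\Big(\sum_{i=1}^{N} w_i f_i\Big) = \sigma^2 \sum_{i=1}^{N} w_i^2 \;\ge\; \frac{\sigma^2}{N}.
\]
```

Under these assumptions the equal-weighted average becomes more precise as N grows, but a price whose weights are tilted (for example, toward more risk-tolerant traders) is noisier than the simple average for any given N.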

To examine how the number of traders influences market efficiency, I conduct markets
with informed and uninformed traders, where the informed traders have imperfect
information regarding the state outcome (i.e., there is aggregate uncertainty). In these
markets, I consider how efficiency changes when the number of traders doubles, hold- 
ing the fraction of informed traders constant. 


III. The Laboratory Markets


In presenting the design of the experiments, I will first describe the details of each 
market and then explain the design choices. The design of each market is summarized 
in table 1. 

A total of nine markets were conducted. Except for market 9, each was conducted 
over two consecutive nights, and the second night was simply a continuation of the
previous night.7 Except for market 9, a new set of subjects was recruited for each market.
Markets proceeded as a sequence of periods. In each period, traders were endowed 
with two units of a risky asset, which had a one-period life. There was zero supply of a 
riskless asset (called francs), which served as the numeraire. Traders were endowed at 
the beginning of each period, and holdings at the end of the period could not be carried 
forward into the next period. The payoff schedule for the risky asset was constant over 
the two nights, identical for all traders, and common knowledge. 


6 An appendix available from the author presents two simple asset-pricing models. As the number of 
traders increases, prices become more efficient in the first model and less efficient in the second. 

7 Although conducting a market over two consecutive nights has the obvious advantage of increasing the 
number of replicate observations, the method suffers from the disadvantage that the experimenter loses control 
of the subjects for 24 hours. To reduce the possibility of subject interactions outside the laboratory, subjects 
were randomly recruited from a large pool, so it was unlikely that they had strong social ties. Also, at the end of 
the first night, subjects were told that the following night would be “an experiment in the economics of market 
decision making,” which was the exact language used to recruit subjects for many different types of experi- 
ments on campus. The idea was to be sufficiently vague that they weren't sure what the next night’s task would 
be. To reduce the possibility of “no shows” on the second night, subjects were informed at the beginning of the 
first night that it was imperative that they participate both nights, and at the end of the first night they were 
given an IOU but no money for their night's earnings. All subjects returned for the second night in all markets. 

Table 1
Experimental Design and Market Parameters

                         Number of   Number of   Number of
Market   Type¹           Traders     States²     Clues³      Distribution of Clues⁴

1        AC              12          4           3           Evenly between three groups
2        AC              12          4           3           Evenly between three groups
3        AU              12          4           2           Evenly between two groups
4        AU              12          4           2           Evenly between two groups
5        N12             12          3           1           Only six traders receive clue
6        N12             12          3           1           Only six traders receive clue
7        N6               6          3           1           Only three traders receive clue
8        N6               6          3           1           Only three traders receive clue
9        3rd night AU⁵   10          4           2           Evenly between two groups

¹ AC means aggregate certainty. AU means aggregate uncertainty. N12 means 12 traders. N6 means six
traders. All these markets were conducted for two consecutive nights.

² When there were four states the dividends were 0, 100, 400, and 700 francs, respectively. When there
were three states the dividends were 100, 400, and 700, respectively.

³ A clue identified one of the state outcomes that did not occur.

⁴ In all cases the distribution of clues among traders was random each trading year, subject to the given
constraints.

⁵ Subjects in market 9 had participated in either market 3 or market 4. This market was conducted one
night only.


All markets were organized as oral double auctions, and all bids, offers, and con- 
tracts were made publicly and publicly recorded. Traders were free to buy or sell units 
of the risky asset anytime during the auction, and there was no limit to the amount of 
the riskless asset they could borrow to finance their purchases.8 However, no short
sales of the risky asset were permitted. All trades and payoffs were denominated in 
francs, which were converted to dollars at the end of the experiment at the rate of 
$0.0017 per franc. Subjects in markets 1 through 4 and market 9 were students at the 
University of Iowa, and subjects in markets 5 through 8 were students at Stanford 
University. With one exception, no comparisons are made across the different subject 
pools. The average amount earned by each subject was $46 for four to six hours of 
participation. 

The instructions were read aloud to all traders at the beginning of each night.9
These instructions stated that the occurrence of a state outcome and the combination of 
signals, called clues in the experiment, would depend on the number of a ball drawn 
from a bingo cage. All traders were then given some experience with the generation of 
clues and state outcomes based on draws from the bingo cage. The bingo cage was 
operated in full view of all market participants for the duration of the experiment, and 
each resulting state outcome was publicly verified to be consistent with the described 
procedure at the end of each period. 


8 To make this less confusing in the experiment, each trader was endowed with 10,000 francs at the
beginning of each period and charged a fixed cost of 10,000 francs at the end of each period. Although this limits the
amount of borrowing in principle, no trader was ever constrained by this limit. 

9 The instructions were straightforward adaptations of the instructions published in Forsythe and
Lundholm (1990) and are available on request.

Markets 1 through 4 were designed to study the difference between aggregate
certainty and aggregate uncertainty. In these markets, the risky asset had four different 
payoff-relevant states. In state W, each unit of the risky asset paid 0 francs. In states X, 
Y, and Z, the payoff was 100, 400, and 700 francs, respectively. The prior probability of 
each state was 1/4 (i.e., an equal number of bingo balls was assigned to each outcome). 
There were 12 traders in markets 1 through 4. Markets 1 and 2 will be referred to as the 
AC (aggregate certainty) markets and markets 3 and 4 will be referred to as the AU 
(aggregate uncertainty) markets. 

In the AC markets, traders were randomly divided into three groups of four for 
each period, and each group received a different clue informing them that one of the 
possible states of nature did not occur. For instance, if the state outcome was W, then 
the three clues distributed among the groups were “not X,” “not Y,” and “not Z.” 
Which trader received which signal was not disclosed, but the process by which signals 
were generated and distributed was described in the instructions. 

Because there was aggregate certainty in the AC markets, there was no need for an 
artificial economy to determine the efficient price. For markets with aggregate cer- 
tainty, the efficient price is simply the asset’s payoff in the realized state, which is per- 
fectly identified by the market signal. The efficient allocation, however, is indetermi- 
nate. Because the asset’s payoff is known with certainty in an efficient AC market, all 
traders would be willing to trade to any allocation at a price equal to the asset’s payoff. 
Consequently, any allocation would be efficient. 

In the AU markets (markets 3 and 4) there was aggregate uncertainty about the 
asset’s payoff. The markets proceeded as a sequence of replicate “years,” and each year 
was divided into two periods. Period A was the economy of interest and period B, 
which followed immediately after period A, was the corresponding artificial economy. 
In period A, traders were randomly divided into two groups of six, and each group re- 
ceived a clue eliminating one of the possible outcomes. Because the market signal elim- 
inated only two of the four possible payoffs, there was aggregate uncertainty regarding 
the asset’s payoff. Under these conditions, traders were endowed, received their infor- 
mation, traded, and payoffs were realized. In period B, the market was reinitialized 
(i.e., traders were endowed, but their period A holdings were not available for trading), 
and the market signal from period A was announced publicly. At the end of period B, 
the outcome was determined by another independent draw from the bingo cage, with 
the probability of each outcome determined by the market signal in period A. For exam- 
ple, if the clues in period A were “not W” and “not X,” then the possible outcomes in 
period B were Y or Z, and each was equally likely.10

Although the periods occurred in pairs, the outcome of period B was independent 
of the outcome in period A, conditional on the period A market signal. The economy in 
period B was identical to that of period A, with the exception that each trader knew the 
market signal. By definition, the period B price and allocation is the efficient price and 
allocation for the period A economy.11
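
The information structures of the AC and AU markets can be sketched in a few lines of Python. This is only an illustration of the design as described: the function and variable names are hypothetical, and the rule that remaining states are equally likely and the two excluded clue pairs are taken from the text and footnote 10.

```python
from fractions import Fraction
from itertools import combinations

# Payoffs in francs for the four-state markets (markets 1 through 4).
PAYOFFS = {"W": 0, "X": 100, "Y": 400, "Z": 700}


def remaining_states(clues):
    """States still possible after pooling 'not <state>' clues."""
    eliminated = set(clues)
    return [s for s in PAYOFFS if s not in eliminated]


def expected_payoff(clues):
    """Expected payoff given the pooled clues; remaining states are equally likely."""
    states = remaining_states(clues)
    return Fraction(sum(PAYOFFS[s] for s in states), len(states))


# AC market: three clues pin down the state, so the efficient price is the payoff.
ac_signal = ["X", "Y", "Z"]  # "not X", "not Y", "not Z"
# AU market: two clues leave two equally likely states.
au_signal = ["W", "X"]       # "not W", "not X"

# Footnote 10: of the six possible two-clue combinations, two were not used.
EXCLUDED = [{"W", "Y"}, {"X", "Z"}]
allowed_au_signals = [
    set(pair) for pair in combinations(PAYOFFS, 2) if set(pair) not in EXCLUDED
]

if __name__ == "__main__":
    print(remaining_states(ac_signal), expected_payoff(ac_signal))
    print(remaining_states(au_signal), expected_payoff(au_signal))
    print(len(allowed_au_signals))
```

With the AC signal a single state survives, so no artificial economy is needed. With the AU signal the expected payoff is known, but the efficient price still depends on risk preferences, which is why period B is run.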


10 There are six different ways to eliminate two state outcomes from a set of four, so in principle there are six
different possible signal combinations. However, only four of the possible combinations were allowed. This 
was done to match the number of possible market signals in the AU and AC markets. The two clue combina- 
tions that were eliminated were (not W—not Y) and (not X—not Z). 

11 A behavioral hypothesis might predict that period B prices will be close to period A prices for reasons
very different from market efficiency. The prediction is that subjects will ignore the probabilities of different
outcomes in period B and trade at prices similar to those they traded at in period A simply because those are the
terms of trade that are freshest in their minds. If this is the case, however, then we would also expect the period
B prices in year N to be similar to the period A prices in year N+1. To test this hypothesis, I computed the
Pearson correlations between the first, last, and average period B prices with the next year’s first, last, and
average period A prices. There were 35 trading years in total, when all markets using the period A-period B
design were aggregated, and none of the correlations was significant below the 0.25 level.

There is one difference between periods A and B that cannot be controlled. Since period B follows A
chronologically, each trader’s total wealth at the beginning of period B is, on average, $1.44 higher than it was at the
beginning of period A. I assume throughout the study that such wealth effects are negligible.

Another design would be to use the Berg et al. (1986) procedure to induce a specific set of preferences. This
would allow the price in period B to be determined analytically rather than empirically (see O'Brien 1988). This
method has the benefit of providing exogenous predictions for the price and allocation in period A, but it
requires more assumptions than the present design does when assessing efficiency. The theoretically determined
efficient price and allocation predictions will be correct only if (1) the preference induction technique
works, and (2) the model of asset pricing is correct. At a minimum, such a model would assume traders are
expected-utility maximizers and that the market is in competitive equilibrium. The design used here makes no
such assumptions. Prices and allocations are simply observed; the process that generates them is unimportant.


Previous experimental work in the aggregation of diverse information has used
three states of nature and two clues (Plott and Sunder 1988; Forsythe and Lundholm
1990). It was necessary to increase this to four states and either two or three clues in
order to compare aggregate uncertainty with aggregate certainty and still have diverse
information to aggregate in an efficient market. I have made other design choices to
compensate for this additional complexity. Specifically, the payoff distribution of the
risky asset was common knowledge and identical for all traders, and the market was
conducted for two consecutive nights. Plott and Sunder found that markets were efficient
when all traders had identical payoffs, even though the payoff schedule was not
common knowledge. Forsythe and Lundholm found that markets in which traders had
diverse payoffs were also efficient if the payoff distribution was common knowledge
and traders were given additional experience by conducting the market for two consecutive
nights. Other features of the markets in this study (12 traders, no short selling,
the endowment levels, and random assignment of traders to groups in each trading
year) were chosen to be consistent with the markets in Forsythe and Lundholm (which
are very similar to the markets in Plott and Sunder).

Market 9 was identical to the AU markets except that there were only ten subjects,
all of whom had participated in either market 3 or market 4, and the market was conducted
for one night only. The purpose of market 9 was to observe an AU market with
more experienced subjects. Why this might be important will be discussed in the results
section.

Markets 5 through 8 were designed to study the importance of the number of
traders in the market. In these markets, the risky asset had three payoff-relevant states.
The dividends in states X, Y, and Z were 100, 400, and 700 francs, respectively, and the
prior probability of each state was 1/3. The information structure in markets 5 through
8 was one of aggregate uncertainty, so each trading year was divided into periods A and
B as in the AU markets. Prior to each period A, half of the traders were randomly
selected and given a clue that eliminated one of the possible state outcomes. The remaining
group of traders received no clue.12 In period B, the market signal (i.e., the
informed traders’ clue in period A) was publicly announced, as in the AU markets.
There were 12 traders in markets 5 and 6, which will be referred to as the N12 markets,
and there were six traders in markets 7 and 8, which will be referred to as the N6 markets.


12 Randomly selecting the informed traders each year is consistent with Banks (1985) but inconsistent with
Plott and Sunder (1982), where the informed traders remained the same throughout the experiment. Randomizing
the informed traders eliminates the possibility that uninformed traders know the identity of the informed
traders prior to the opening of the market.

Twelve traders is the upper bound for an oral double auction of this type. With 
more traders, the auctioneer cannot distinguish between different traders’ actions and 
still execute trades in a timely fashion. With 12 traders as an upper limit, six was con- 
sidered the smallest number that would still maintain competition between the in- 
formed traders. Forsythe and Lundholm (1990) found that three traders per group 
created sufficient competition between traders with the same information (and payoffs) 
to eliminate any significant market power.13

Markets 5 through 8 were designed with the primary purpose of measuring the dif- 
ference in efficiency between markets with different numbers of traders. However, 
they were conducted after markets 1 through 4, and designed in light of the results from 
these markets. Specifically, the AU markets were relatively inefficient. The N12 mar- 
kets, with only one signal, measure the impact of aggregate uncertainty in a simpler set- 
ting; one that does not require the aggregation of diverse information. Contrasting the 
results between the AU and N12 markets will assess the effect of the additional complexity
introduced with diverse information.14

Table 1 gives the overall experimental design. The contrast between the AU and AC
markets measures the effect of aggregate uncertainty on efficiency. Market 9 provides 
some additional evidence on this point. The contrast between the N6 and N12 markets 
measures how the number of traders influences efficiency. Finally, the comparison 
between the AU and N12 markets measures the change in efficiency when more than 
one signal must be aggregated and communicated through the market process. 


IV. Results 


I will assess the effect of the aggregate uncertainty treatment and the number of 
traders treatment in terms of the efficiency of prices, the efficiency of allocations, and, 
for the markets that eventually became efficient, the speed of convergence to the effi- 
cient price. 

Informational efficiency is defined categorically, so that a market’s price and allo- 
cation are either efficient or inefficient. With this definition, a single transaction away 
from the efficient price or allocation would be sufficient to reject efficiency. A more 
useful metric is some measure of distance away from efficient prices and allocations. I 
adopt as a measure of price inefficiency the absolute deviation from the efficient price. 
This is the amount a trader could earn on a single transaction by trading at the current 
price and later reversing his or her position at the efficient price. Except for scaling by 
the beginning price, this metric is very similar to an “abnormal return,” which is the 
measure of price inefficiency used by the empirical literature. 

13 The competitive tendencies of the double auction trading institution have been thoroughly documented.
See Smith (1982), Plott (1982), or, for an application in asset markets with private information, Friedman and
Ostroy (1989).

14 This is the aggregate uncertainty equivalent to the contrast between Plott and Sunder (1982), which had
two states and one information signal, and Plott and Sunder (1988), which had three states and two information
signals.


Although the definition of market efficiency refers to a single price, trades in a
double oral auction can take place at different prices, so there is a need for a convention
about which price or prices will be used to represent the period. In period A, traders
have differential information and, consequently, transaction prices are hypothesized to
be informative. The last transaction in period A will be used to represent the period A
price because it is made with the benefit of observing all prior market data. In contrast,
there is no information transmission regarding the state outcome in the period B
markets; all trades should be due to differential preferences for the lottery created by
the risky asset. For period B, the average of the transaction prices is used to represent
the period’s price. Thus, the operational measure of price inefficiency is the absolute
difference between the last transaction price in period A and the average transaction
price in period B.
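
As a minimal sketch, the operational measure of price inefficiency could be computed as follows. The function names and the example prices are hypothetical; only the definition (last period A price versus average period B price) and the conversion rate of $0.0017 per franc come from the text.

```python
from statistics import mean

DOLLARS_PER_FRANC = 0.0017  # conversion rate stated in the design section


def price_inefficiency(period_a_prices, period_b_prices):
    """Absolute gap between the last period A price and the mean period B price."""
    if not period_a_prices or not period_b_prices:
        raise ValueError("both periods need at least one transaction")
    return abs(period_a_prices[-1] - mean(period_b_prices))


def francs_to_dollars(francs):
    """Convert an amount in francs to dollars at the experiment's rate."""
    return francs * DOLLARS_PER_FRANC


# Hypothetical transaction prices (in francs) for one trading year.
example_a = [480, 500, 520]  # period A, in order of execution
example_b = [540, 550, 560]  # period B

if __name__ == "__main__":
    gap = price_inefficiency(example_a, example_b)
    print(gap, round(francs_to_dollars(gap), 4))
```

Using only the last period A transaction reflects the argument that this trade is made with the benefit of all prior market data, while averaging period B treats its trades as equally informative draws from the same lottery.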

The experimental literature has used the number of misallocated securities relative 
to the Pareto efficient allocation as an index of Pareto efficiency.15 Because informationally
efficient allocations are also Pareto efficient allocations in my markets, this
metric measures both notions of efficiency. The operational measure of allocation in- 
efficiency is the number of misallocated units in the period A allocation relative to the 
period B allocation, as a percentage of the total supply of the risky asset. 
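
The allocation measure can be sketched in the same way. The counting rule below is an assumption, since the text does not spell it out: a unit is treated as misallocated if it would have to change hands to move from the period A allocation to the period B allocation. The function names and the example holdings are hypothetical.

```python
def misallocated_units(holdings_a, holdings_b):
    """Units that must change hands to turn allocation A into allocation B.

    Assumed counting rule: sum, over traders, of the units each holds in
    period A in excess of what the same trader holds in period B.
    """
    if len(holdings_a) != len(holdings_b) or sum(holdings_a) != sum(holdings_b):
        raise ValueError("allocations must cover the same traders and supply")
    return sum(max(a - b, 0) for a, b in zip(holdings_a, holdings_b))


def allocation_inefficiency(holdings_a, holdings_b):
    """Misallocated units as a percentage of the total supply of the risky asset."""
    return 100.0 * misallocated_units(holdings_a, holdings_b) / sum(holdings_b)


# Hypothetical end-of-period holdings for six traders endowed with two units each.
example_a = [2, 2, 2, 2, 2, 2]  # period A: no units changed hands
example_b = [4, 4, 4, 0, 0, 0]  # period B: three traders hold the whole supply

if __name__ == "__main__":
    print(misallocated_units(example_a, example_b))
    print(allocation_inefficiency(example_a, example_b))
```

Counting only the excess holdings avoids counting each misplaced unit twice, once for the trader who holds too much and once for the trader who holds too little.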

An analysis of the first night of each market reveals that prices and allocations are 
very inefficient, which is similar to the results in Forsythe and Lundholm. Subjects 
have much to learn at the beginning of the market, apart from learning how to draw in- 
ferences from other traders’ actions. Trading in a double oral auction is not simple. 
Along with becoming accustomed to the special language of the auction, traders are 
gaining experience with the process that generates their information each period and 
learning how they make or lose money. The first night of every market is essentially a 
training session. Consequently, only the second night of each market is used for efficiency
comparisons.16

Figures 1 through 9 present the sequence of prices less the expected value of the 
asset based on the market signal for each trading year for the second night of each mar- 
ket (subtracting the expected value makes it possible to use a smaller range on the 
graphs). At the top of the price sequence for a given year is the market signal, listed as 
the state outcomes that remain possible after considering all signals (and each remain- 
ing state is equally likely). 

As discussed in the design section, markets 1 and 2 did not require an artificial 
economy to determine the efficient market price. For these markets the dotted line rep- 
resents the efficient market price (less the expected value). For markets 3 through 9 the 
prices in period A (the economy of interest) are represented by solid dots, and the prices 
in period B (the artificial economy) are represented by open squares.17 Recall that period
B was conducted immediately following period A within each trading year.


15 The metric is frequently given as the fraction of the maximum gains from trade actually taken. This is
simply a scaled variation of the number of misallocated securities; if n is the number of misallocated units and
N is the total supply of units, then trading efficiency equals 1 − (2n/N). The number of misallocated units can
also be thought of as the number of trades necessary to reach the efficient allocation.

16 The data for the first night of the markets are available on request.

17 Another symbol appears on a few of the graphs. In some markets there was very little trade and, consequently,
the bid/ask data at the end of the period reflected more information than the final transaction price
would indicate. Whenever there were fewer than five transactions in a period, a Δ (∇) was placed on the graph if
the final bid (offer) was greater (less) than the final transaction price. This occurred five times in market 2, and
once each in markets 7 and 8. In these cases, this bid (offer) was used as the final price in the period.
Figure 1
AC Design—Market 1 Second Night Adjusted Price Series 
(price minus expected value using the market signal) 





The Effect of Aggregate Uncertainty on the Efficiency of Prices 


As seen in figure 1, years 16 through 21 of the second night of market 1 were very
efficient, with an average absolute deviation between the last price and the efficient
price of only 16 francs per period, which is less than three cents. Similarly, years 18
through 21 of market 2 were very efficient, with an average absolute deviation of 17
francs, as seen in figure 2. Prior to the end of the AC markets, however, there is little
consistent evidence in support of an efficient market. Prices were generally less responsive
to the market signal than predicted by the RE model. That is, they were higher
than the efficient price when the efficient price was low and lower than the efficient
price when the efficient price was high. The relatively long time these markets took to
converge to the efficient price is in contrast to the uniform dividend markets in Plott
and Sunder (1988). In similar markets, but with only three states and two clues, Plott
and Sunder found that one market reached the RE prediction in 12 trading years and
another reached it immediately. The additional complexity of aggregating the extra
clue in my markets caused a significant decrease in the speed of convergence in the AC
markets relative to the Plott and Sunder markets.

Figure 2
AC Design—Market 2 Second Night Adjusted Price Series
(price minus expected value using market signal)

Note: Δ (∇) represents the final bid (offer) if there were fewer than five transactions in a period and the final
bid (offer) was greater (less) than the final transaction price.

Figure 3
AU Design—Market 3 Second Night Adjusted Price Series
(price minus expected value using market signal)

Figure 4
AU Design—Market 4 Second Night Adjusted Price Series
(price minus expected value using market signal)

Unlike the final years of the AC markets, the final years of the AU markets remained
very inefficient, as seen in figures 3 and 4. The average absolute deviation of
the final period A price from the average period B price is 106 francs for the last three
years of markets 3 and 4. Period A prices do not respond sufficiently to the market signal,
remaining below the efficient price when the asset’s expected payoff is high and
above the efficient price when the asset’s expected payoff is low. Figure 10 plots the
average absolute deviation from the efficient price for the final ten years of the AU markets
and the AC markets. As seen in the figure, the AU markets were generally less efficient
than the AC markets. As a formal test, I used a Wilcoxon rank sum statistic to test
the hypothesis that the average absolute price deviations in the AC markets are different
from those in the AU markets. Based on the final realization of each possible market signal,
the test showed that the AC markets have significantly lower price deviations, with
a p-value of 0.008 (16 observations; eight from AC markets and eight from AU markets).
The significance level is 0.099 when the last ten years of the AU and AC markets are
compared (40 observations).18
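
For readers unfamiliar with the rank sum statistic, the sketch below shows one way to compute it together with an exact two-sided permutation p-value. It is an illustration only: the text does not say whether an exact or an approximate p-value was used, the function names are hypothetical, and the example deviations are invented and are not the experimental data.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (W, p): the rank sum of x and an exact two-sided permutation p-value."""
    ranks = average_ranks(list(x) + list(y))
    n1, n = len(x), len(x) + len(y)
    w_obs = sum(ranks[:n1])
    center = n1 * (n + 1) / 2  # expected rank sum of x under the null hypothesis
    dev_obs = abs(w_obs - center)
    total = extreme = 0
    for subset in combinations(ranks, n1):  # every way to label n1 observations as x
        total += 1
        if abs(sum(subset) - center) >= dev_obs - 1e-9:
            extreme += 1
    return w_obs, extreme / total


# Invented absolute price deviations (francs), for illustration only.
hypothetical_ac = [10, 15, 20, 25]
hypothetical_au = [60, 90, 110, 140]

if __name__ == "__main__":
    print(rank_sum_test(hypothetical_ac, hypothetical_au))
```

With eight observations per group, as in the first comparison above, the enumeration covers 12,870 relabelings, so the exact calculation is inexpensive.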

A dilemma arises when comparing the results from the AC and AU markets. Markets
in both sets consisted of approximately the same number of trading periods
(between 30 and 34 for the first and second nights combined). However, because it was 
not necessary to conduct a period B for the AC markets, these markets realized roughly 
twice as many market signals. Consequently, the AC and AU markets are matched in 
terms of the number of trading periods, but are not matched in terms of the expected 
number of realizations of the market signal. If convergence depends on the number of 
realized market signals, then the previous comparison is biased in favor of the AC mar- 
kets. However, if convergence depends on learning about other traders’ behavior under 
uncertainty, then period B data is potentially very informative because it is not confounded
with the effects of differential information. If this is the case, the previous test
is biased against the AC markets.

If convergence depends on the number of realized market signals, then we would
expect a negative correlation between the number of times a given signal is realized and
the average absolute price deviation for the final occurrence of that signal. This does
not appear to be the case, however. The Spearman rank-order correlations are positive
(and insignificant) for both the AU and AC markets (eight observations each). As
another test, I compared the average absolute price deviations of the AC and AU markets
for only the subset of data in which there was an equal number of market signal
realizations.¹⁹ The comparison between the price deviations of the AU and AC markets
in this data subset is matched by the total number of signal realizations. Using the final
realization of each signal in this dataset, the average absolute deviation is 15.2 francs
for the AC markets and 87 francs for the AU markets. On the basis of a Wilcoxon test,
the AC markets are significantly more efficient, with a p-value of 0.026.


18 In both significance tests, the data were pooled over all different market signal realizations. To verify the
homogeneity of the pool, I performed a Kruskal-Wallis test for differences in mean absolute price deviations for
different market signals in the second-night markets (this test is the k-group equivalent of a Wilcoxon test). The
significance level was 0.28 for the AU markets (19 observations) and 0.25 for the AC markets (42 observations).

Figure 5
N12 Design: Market 5 Second Night Adjusted Price Series
(price minus expected value using market signal)
[Plot not reproduced; horizontal axis: year (1 through 9) and state.]

Figure 6
N12 Design: Market 6 Second Night Adjusted Price Series
(price minus expected value using market signal)

Figure 7
N6 Design: Market 7 Second Night Adjusted Price Series
(price minus expected value using market signal)

Figure 8
N6 Design: Market 8 Second Night Adjusted Price Series
(price minus expected value using market signal)

Note: A (V) represents the final bid (offer) if there were fewer than five transactions in a period and the final
bid (offer) was greater (less) than the final transaction price.

Figure 9
Third Night AU Design: Market 9 Adjusted Price Series
(price minus expected value using market signal)

Figure 10
Average Absolute Deviation from Efficient Price: Second Night AC and AU Markets

Figure 11
Average Absolute Deviation from Efficient Price: Market 9 and Second Night AC Markets
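
The rank-order correlation check described in this passage can be sketched as follows. This is an illustrative implementation of Spearman's coefficient (the Pearson correlation of midranks), assuming nothing about the software actually used in the study; the helper names and the eight data pairs are hypothetical.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    ordered = sorted(values)
    n = len(ordered)
    return [((ordered.index(v) + 1) + (n - ordered[::-1].index(v))) / 2
            for v in values]


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the midranks."""
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical pairs: (times a signal was realized, final absolute deviation).
realizations = [3, 4, 4, 5, 5, 6, 7, 7]
final_deviation = [30, 12, 45, 20, 60, 25, 80, 15]
print(round(spearman(realizations, final_deviation), 3))
```

If convergence were driven by the number of realized signals, the coefficient would be expected to be negative; the text reports positive and insignificant values instead.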

Market 9 was conducted to collect additional evidence free of the dilemma regard- 
ing the appropriate match between the AU and AC markets. In an attempt to get as 
many realized market signals in an AU market as occurred in the AC markets, I recruited
ten subjects from the AU markets to participate for a third night in an AU-type
market.²⁰ The comparison between the last ten trading years of the AC markets and
market 9, the third-night AU market, is clearly biased in favor of the AU design. Subjects
in market 9 observed as many realized signals and 20 more total periods than subjects
in the AC markets.

Figure 9 presents the time series of transaction prices. As seen, the sequence of 
realized market signals in market 9 was ideal for learning the relationship between the 
market signal and market data. With only one exception, all the realizations of each 
market signal occurred consecutively, and only three of the four possible signals oc- 
curred. Nonetheless, market 9 was still somewhat less efficient than the AC markets, as 
shown in figure 11. The average absolute deviation of prices is 58.3 francs in the last
four trading years of market 9, compared with an average deviation of 17 francs in the
last four years of the AC markets. The Wilcoxon test favors the AC markets, with a
p-value of 0.109 based on the final realizations of each signal (eight AC observations
plus three market 9 observations) and 0.298 based on all ten trading years of market 9
and the final ten years of the AC markets (20 AC observations plus ten market 9 observations).


19 For the AU markets, there were seven realizations of the market signal “W or X” in market 4 and five
realizations of the market signal “Y or Z” in market 3. Similarly for the AC markets, there were seven realizations
of the signal “X” and the signal “Y” in market 1, seven realizations of the signal “W” and the signal “Z” in
market 2, and five realizations of the signal “Z” in market 1.

20 I wanted 12 subjects, but only ten (out of the 24 possible) were available.

The results are consistent with the hypothesis that aggregate uncertainty hinders 
the convergence to the efficient price. The final trading years of the AC markets were 
more efficient than the second-night and third-night AU markets. Furthermore, the AC 
markets had three signals to aggregate while the AU markets had only two. Even with 
more information to aggregate than the AU markets, the ease of learning in an environ- 
ment with aggregate certainty caused this information to be aggregated and communi- 
cated in the AC markets much more effectively. 

Relative to previous experimental studies, the AU markets had many features that 
should have aided in achieving an RE equilibrium. All market parameters were com- 
mon knowledge, the payoff of the risky asset was identical for all traders, and there 
were two nights of participation. Nonetheless, the AU markets were considerably less 
efficient than both the AC markets and the markets in other experimental studies. The 
poor performance of the AU markets casts some doubt on the RE model (and efficient
markets) when there is diverse information to aggregate in the presence of aggregate
uncertainty. Given this finding, the results from previous research suggest a stronger test
of the impact of aggregate uncertainty. It has been established that markets in which 
some traders are perfectly informed while other traders are uninformed achieve an RE 
equilibrium quickly; much more quickly than markets in which informed traders have 
diverse information.²¹ The AU markets had diversely informed traders. If the presence
of aggregate uncertainty reduces the efficiency of markets, a stronger test would be in 
markets with uninformed traders and informed traders who observe a common but im- 
perfect signal. Given the results from the AU markets, it is possible that aggregate 
uncertainty is such a severe impediment that even these markets would be found inefficient.
Although not their primary purpose, the next set of experiments provides evidence
contrary to this hypothesis.


The Effect of Diverse Information on Market Efficiency 


Recall that in the N12 markets six traders received a signal eliminating one of the 
possible three states, while the other six traders remained uninformed. The N12 mar- 
kets are less complicated than the AU markets because there is no diverse information 
to aggregate, but they are more complicated than the markets in previous experimental 
studies because the informed traders do not have perfect information regarding the 
state outcome. Comparing the N12 markets with the AU markets measures the difference
between aggregating diverse information and communicating a single piece of information
in the presence of aggregate uncertainty.²²


21 To support this claim, contrast the strong results of Plott and Sunder (1982) and the associated follow-up
studies cited in footnote 5 (where traders were either perfectly informed or uninformed) with the slower
convergence in Plott and Sunder (1988) and Forsythe and Lundholm (1990) (where traders were diversely
informed).

22 A potential confounding factor in this comparison is that there were four possible market signals in the
AU markets, but only three in the N12 markets. Thus, the second-night AU markets averaged 4.75 replications
of each market signal, while the second-night N12 markets averaged six replications.

With the diversity of information as an added factor (which is present in the AU and AC markets and not
present in the N6 and N12 markets), this study can be seen as an incomplete factorial design with three factors:
aggregate uncertainty, number of traders, and diversity of information. An interesting possibility for future
research is to complete the factorial design by running the AU and AC markets with six traders and running the
N6 and N12 markets with aggregate certainty (i.e., have the informed traders be perfectly informed about the
state outcome).

Figure 12
Average Absolute Deviations from Efficient Prices: Second Night AU and N12 Markets

As seen in figures 5 and 6, both N12 markets were very efficient for most of the
second night. Holding aside trading years 2 and 8, the second night of market 5 had an
average absolute deviation of seven francs. Market 6 had similar results, with an average
absolute deviation of eight francs during years 3 through 9. Both deviations are less
than two cents. Although the N12 markets converged much more slowly than Plott and
Sunder’s (1982) markets with perfectly informed traders, the fact that they eventually
reached the RE equilibrium predictions shows that aggregate uncertainty is not an
insurmountable obstacle. Figure 12 compares the price efficiency for second-night AU and
N12 markets. As seen in the figure, the additional complication of aggregating diverse
information in the AU markets caused a dramatic reduction in price efficiency. Based
on a Wilcoxon test, the price deviations in the N12 markets are significantly less than in
the AU markets at the 0.005 and 0.001 levels, using only the last realizations of each
market signal (six N12 observations plus eight AU observations) and all years in the
second-night markets (18 N12 observations and 19 AU observations), respectively.

Figure 13 compares the average deviation from the efficient allocation between the
N12 and the AU markets. Unlike the distinct difference in price performance between
the two information structures, there is no significant difference in terms of allocation
efficiency. The significance levels of the Wilcoxon test are 0.302 and 0.113 for the comparison
using only the last realization of each market signal and the comparison using
all years, respectively.

Taken together with the AU/AC contrast, these results suggest that diverse information
by itself does not hinder efficiency (because the AC markets were efficient),
aggregate uncertainty by itself does not hinder efficiency (because the N12 markets
were efficient), but diverse information and aggregate uncertainty together do impede
efficiency (because the AU markets were inefficient).

Figure 13
Deviation from Efficient Allocation: Second Night AU and N12 Markets
(as a percent of total supply)


The Effect of the Number of Traders on the Efficiency of Prices 


The primary purpose for conducting the N12 and N6 markets was to study how
changing the number of traders affects the efficiency of markets. As seen in figures 7 
and 8, prices in the N6 markets were efficient for much of the second night. The aver- 
age absolute price deviation for the second night of market 7 was 24 francs, holding 
aside years 2 and 8. For the second night of market 8, the deviation was 11 francs, hold- 
ing aside year 5. 
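
The summary statistic quoted here, an average absolute price deviation with certain years held aside, can be sketched as below. This is only an illustration under stated assumptions: the text does not specify whether the average is taken over transactions or over years, so the sketch averages over individual transaction prices, and the function name, argument names, and all prices are hypothetical.

```python
def average_absolute_deviation(prices_by_year, efficient_price_by_year,
                               hold_aside=()):
    """Mean |price - efficient price| over all trades, skipping held-aside years."""
    total = 0.0
    count = 0
    for year, prices in prices_by_year.items():
        if year in hold_aside:
            continue  # exclude the years that are "held aside"
        for price in prices:
            total += abs(price - efficient_price_by_year[year])
            count += 1
    return total / count


# Hypothetical transaction prices and efficient prices, in francs.
prices = {1: [210, 190], 2: [300, 320], 3: [205]}
efficient = {1: 200, 2: 200, 3: 200}
print(average_absolute_deviation(prices, efficient))
print(average_absolute_deviation(prices, efficient, hold_aside={2}))
```

In this made-up example a single poorly converged year dominates the overall average, which is why statistics that hold aside a few years can look very different from all-year averages.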

Figure 14 presents a plot of the average absolute deviation from the efficient price 
over the nine years for the N6 and N12 markets. As seen in the figure, the N12 markets 
are only slightly more efficient than the N6 markets. Averaging over all years in the 
second night, the absolute deviation for the N12 markets is only six francs less than the 
absolute deviation for the N6 markets. Using a Wilcoxon test based on the last realiza- 
tion of each market signal, there is no significant difference in the price deviations, 
with a p-value of 0.522 (12 observations). If the test uses all years in the second night, 
the p-value is 0.165 (36 observations). 


23 For both significance tests, the data were pooled over all different market signal realizations. To verify 
the homogeneity of the pool, I performed a Kruskal-Wallis test for differences in mean absolute price deviations 
for different market signals in the second-night markets. The significance level was 0.59 for the N6 markets and
0.23 for the N12 markets (18 observations for each test).


Lundholm—Efficiency of a Market 509 


Figure 14
Average Absolute Deviations from Efficient Prices: Second Night N6 and N12 Markets
Figure 15
Deviation from Efficient Allocation: Second Night N6 and N12 Markets
(as a percent of total supply)


The Effect of the Number of Traders on the Efficiency of Allocations


Figure 15 compares the allocation deviations of the N12 and N6 markets. As seen,
the average deviation is smaller for the N6 markets in every trading year. Comparing
the deviations using a Wilcoxon test yields significance levels of 0.029 when comparing
the last realization of each signal, and 0.001 when comparing all years in the second
night. As will be seen, the N6 markets are significantly more efficient in terms of final
allocations because they take fewer trades to converge to the efficient price.²⁴


The Effect of the Number of Traders on the Volume of Trade 


The most notable difference between the N6 and N12 markets is the volume of 
trade. For the second-night markets, the N6 markets averaged 2.8 trades per period 
while the N12 markets averaged 14.2 trades per period. Although the supply of the risky
asset doubled between the N6 and N12 markets, the volume of trade increased fivefold.
Furthermore, the difference in volume occurs in period B as well as in period A, and
the only reason for trade in period B is to adjust for differing preferences for the lottery
represented by the risky asset. Because the volume difference arises in period B, it must
be due to differences in the composition of preferences, and not an artifact of differen- 
tial information. 

To see why volume may increase with the number of traders, consider each trader’s 
preference for the lottery as having been drawn from a population of preferences. As 
the number of traders increases, the sample of preferences will contain more extreme 
observations. It is possible that increasing the number of traders increases the disper- 
sion of preferences sufficiently to cause the volume of trade to increase faster than the 
associated increase in the total endowment. To investigate the possibility that preferences
became more diverse as the number of traders increased, I examined the behavior
of the extreme buyers and sellers in each market. In the N12 period B markets, the
top two buyers together purchased 13.3 units on average, while the top two buyers in 
the N6 period B markets together purchased only 6.8 units on average. Similarly, the 
two extreme sellers in the N12 period B markets held only 0.04 units on average, while 
the two extreme sellers in the N6 period B markets held 1.36 units on average. Both differences
(buyers and sellers) are significant at less than 0.001 using a Wilcoxon test (36
observations for each test). The extreme buyers (sellers) in the N12 markets are willing 
to take larger (smaller) positions relative to their counterparts in the N6 markets. 
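
The sampling argument in this paragraph (a larger draw from a population of preferences tends to contain more extreme observations) can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that a trader's preference is a single number drawn from a standard normal population; the study makes no such distributional claim, and the function name and parameters are hypothetical.

```python
import random


def mean_sample_range(n_traders, draws=20000, seed=0):
    """Average (max - min) of n_traders preference draws, over many samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(draws):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_traders)]
        total += max(sample) - min(sample)
    return total / draws


range_six = mean_sample_range(6)
range_twelve = mean_sample_range(12)
print(round(range_six, 2), round(range_twelve, 2))
```

Under this assumption the expected gap between the most extreme buyer and the most extreme seller widens as the market grows from six to twelve traders, which is the direction of the effect the period B evidence is consistent with.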


The Effect of the Number of Traders on the Speed of Convergence


Since the prices in both the N6 and N12 markets were eventually efficient, it is 
interesting to consider whether the number of traders had any influence on the speed of 
convergence to the efficient price. This comparison was conducted at two levels. First,
the speed of convergence over the entire two-night market was assessed; then, within
each period A that ended near the efficient price, the speed of convergence within the
period was assessed. Measuring the speed of convergence within an efficient period assesses
how quickly traders learn the market signal. Measuring the speed of convergence
over the entire two-night market assesses how quickly traders learn the process of
inferring the market signal from a given market information system. The speed of convergence
within a period is analogous to the “postannouncement drift” following a
particular earnings announcement. It measures the speed with which the market becomes
efficient after a particular signal is realized. The speed of convergence over the
entire market is analogous to the speed with which markets become efficient after a
change in the information system. For example, markets may have been inefficient initially
with respect to segmental data when it was first reported, but may have become
efficient with respect to realizations from this new information system after a few years
of disclosure.


24 I also considered how the number of traders affected the relative profits earned by the informed traders
in the N6 and N12 markets. Using the final allocation of risky and riskless assets, I computed the informed
traders’ fraction of the expected profits (using their information) for each period A in the second-night markets.
The informed traders earned 51.9 percent of the profits in the N6 markets and 52.7 percent of the profits in the
N12 markets. This difference is insignificant, with a t-test yielding a p-value of 0.700.

The speed of convergence over the entire market was measured by fitting a regression
model of the mean absolute price deviation on the year number. Let AAD be the
average absolute price deviation, YEAR be the year number, treating the second night
as a continuation of the first night, and N be a dummy variable that equals 1 for N12
markets and 0 for N6 markets. Because AAD is bounded below by zero, the fitted model
is:


\[ \mathrm{AAD} = e^{(\beta_0 + \beta_1 N)} \, \mathrm{YEAR}^{(\beta_2 + \beta_3 N)}. \]

Taking the log of both sides gives the following linear regression:

\[ \log(\mathrm{AAD}) = \beta_0 + \beta_1 N + \beta_2 \log(\mathrm{YEAR}) + \beta_3 N \cdot \log(\mathrm{YEAR}) + \epsilon. \]


The coefficients $\beta_1$ and $\beta_3$ represent the incremental effect on the intercept and slope,
respectively, when the data is from an N12 market rather than an N6 market. If there is
a statistical difference in the speed of convergence between the N6 and N12 markets,
either $\beta_1$ or $\beta_3$ will be significantly different from zero. The estimated model, with
p-values for the t-test of each coefficient given in parentheses, is:


\[ \log(\mathrm{AAD}) = \underset{(.000)}{4.52} - \underset{(.238)}{1.22}\,N + \underset{(.038)}{.787}\,\log(\mathrm{YEAR}) + \underset{(.711)}{.195}\,N \cdot \log(\mathrm{YEAR}). \]


The estimated model indicates that neither $\hat{\beta}_1$ nor $\hat{\beta}_3$ is significantly different from
zero.²⁵ The number of traders in the market appears to have no significant impact on
the speed of convergence over the entire life of the market.
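
To make the relation between the multiplicative model and its log-linear form concrete, the sketch below evaluates both using the point estimates as they are printed in the estimated model above (signs and magnitudes are taken as printed, not re-estimated). The function names are hypothetical, and the example is a numerical check rather than a replication of the regression.

```python
import math

# Point estimates exactly as printed in the estimated model above.
B0, B1, B2, B3 = 4.52, -1.22, 0.787, 0.195


def predicted_log_aad(year, n12):
    """Log-linear form: b0 + b1*N + b2*log(YEAR) + b3*N*log(YEAR)."""
    n = 1 if n12 else 0
    return B0 + B1 * n + B2 * math.log(year) + B3 * n * math.log(year)


def predicted_aad(year, n12):
    """Multiplicative form: exp(b0 + b1*N) * YEAR ** (b2 + b3*N)."""
    n = 1 if n12 else 0
    return math.exp(B0 + B1 * n) * year ** (B2 + B3 * n)


for year in (1, 5, 10):
    print(year, round(predicted_aad(year, False), 1),
          round(predicted_aad(year, True), 1))
```

Because the two forms are algebraically identical, taking the log of `predicted_aad` reproduces `predicted_log_aad` for any year, which is the step the text describes as taking the log of both sides.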

The speed of convergence within those period As that were efficient was measured 
with two different criteria for “efficient.” A period was considered sufficiently effi- 
cient if its last transaction price was within either ten or 20 francs of the efficient price. 
These criteria determined the relevant dataset and definition of an efficient price. The 
speed of convergence was then measured as the total number of trades that took place 
prior to prices’ becoming (and remaining) efficient. Figure 16 gives frequency distribu- 
tions for the number of inefficient trades in the N6 and N12 markets. As seen in the 
figure, the N6 markets converged considerably faster than the N12 markets. The N6 
markets became efficient after only 0.60 and 0.63 trades on average, using the “within 
20” and “within 10” criteria, respectively. In contrast, the N12 markets took an average 
of 3.7 trades and 7.2 trades for the “within 20” and “within 10” criteria, respectively. 
On the basis of either criterion the differences between the N6 and N12 markets are sig- 
nificant at less than the 0.02 level, with a two-tailed Wilcoxon test. 
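
The within-period measure described above, the number of trades that occur before prices become and remain efficient, can be sketched as follows. The sketch assumes that "within ten (or 20) francs" is inclusive and that a period whose last price is outside the band is excluded; both are assumptions, the function name is hypothetical, and the price sequence is invented.

```python
def inefficient_trades(prices, efficient_price, tolerance):
    """Trades before prices become and remain within tolerance of the efficient price.

    Returns None when the period does not qualify as efficient, that is,
    when the last transaction price is outside the tolerance band.
    """
    if not prices or abs(prices[-1] - efficient_price) > tolerance:
        return None
    count = len(prices)
    # Walk backward from the last trade while prices stay inside the band.
    for price in reversed(prices):
        if abs(price - efficient_price) > tolerance:
            break
        count -= 1
    return count


# Hypothetical transaction prices in one period; efficient price of 100 francs.
period_prices = [150, 118, 95, 112, 104, 98]
print(inefficient_trades(period_prices, 100, 10))
print(inefficient_trades(period_prices, 100, 20))
```

A price that dips inside the band and then leaves it again (95 in the example under the ten-franc criterion) still counts as inefficient, because prices must remain within the band through the end of the period.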

In summary, both the N6 and N12 markets eventually converge to the efficient
price. However, the N12 markets converge significantly more slowly than the N6 markets,
and the final allocations are significantly less efficient. It is possible that both of these
differences are due to the increased dispersion of preferences in the N12 markets relative
to the N6 markets (as shown by the more extreme buyer and seller behavior in the
N12 markets). Trade takes place in period A markets because traders have different
beliefs and different risk attitudes. For a given difference in beliefs, however, traders
with greater differences in preferences will be willing to transact in a larger range of
prices. A larger range of acceptable transaction prices means that the data that other
traders must use to draw inferences has a higher variance, which makes inferences
more difficult and slows the convergence process. Finally, the difference in the efficiency
of allocations between the N12 and N6 markets may also be due to the N12
markets’ slow convergence; trades that take place prior to full revelation of the market
signal move the final allocation away from the efficient allocation.²⁶


25 Analysis of the residuals from this model suggested no serious model misspecification. The Durbin-Watson
statistic was 2.19 (where 2 is the null value), which suggests that there was no residual first-order serial
correlation.

Figure 16
Frequency Distributions of the Number of Inefficient Trades
for the N6 and N12 Markets
(as a fraction of all efficient periods)


V. Conclusions 


This study has considered how different features of a market affect its price and 
allocation efficiency. Because the study was conducted in a laboratory setting, it was 
possible to measure the market’s efficiency precisely and to manipulate different fea- 
tures of the market. Neither the precise measurement nor the manipulation would be 
possible with data from naturally occurring markets. Nonetheless, it is the naturally 
occurring markets that we are ultimately concerned about. In this section I will de- 
scribe what implications the laboratory results have for the “real world,” with it under- 
stood that such generalizations come with a host of caveats. 

When researchers examine the efficiency of markets with respect to some accounting
disclosure, it must be the case that they expect different traders to have different
information during the sample period; otherwise, the market would be efficient by definition.
However, the researcher’s expectations about the distribution of information
among traders are rarely, if ever, discussed. This study indicates that the nature of the
information asymmetry is very important. If the researcher believes that accounting 
disclosures effectively distribute different signals to different traders, perhaps due to 
differences in expectations models, then the information structure is like that of the AU 
or AC markets. Alternatively, if the researcher believes that all traders receive the same 
information from accounting disclosures, but some receive the information prior to 
others, then the information structure is like that of the N6 or N12 markets. Based on 
the comparison between the AU and N12 markets, I would expect markets with an
information structure of informed and uninformed traders to be more efficient than
markets in which all traders are informed but different traders have different information.

Future research might structure the nature of the information asymmetry based on 
the results of recent empirical work. For instance, Bernard and Thomas (1990) find evi- 
dence consistent with a hypothesis that a large group of traders use a seasonal random 
walk to evaluate quarterly earnings announcements, but that the actual time series process
depends on both last season’s earnings and last quarter’s earnings. The differences
in beliefs that arise due to these different models have a very particular structure that
could be implemented and studied in the laboratory.


26 It is interesting to note that the N6 markets converged faster than the N12 markets even after standardizing
by the volume of trade in the period. The fraction of the total number of trades that occurred prior to being
within ten or 20 francs of the efficient price was 0.27 and 0.26, respectively, for the N6 markets, compared with
0.56 and 0.30 for the N12 markets. The p-values from the Wilcoxon tests are 0.04 and 0.63 for the “within ten”
and “within 20” criteria, respectively.

Perhaps the most interesting result in this study is that the N12 markets took far 
more trades to reach the efficient price than the N6 markets. There was a greater 
dispersion of preferences in the N12 markets than in the N6 markets; consequently, 
there may have been a greater percentage of trades motivated primarily by preference 
differences rather than information differences. It is possible that the primarily pref- 
erence-based trades introduced noise into the market data and made it more difficult to 
discern the informed traders’ information. This result suggests that the drift in abnor- 
mal returns following earnings announcements may be larger, in terms of number of 
trades, for firms with more traders. Because large firms tend to have a relatively large 
number of active traders, this result seems incongruent with the inverse relation be- 
tween the length of the postannouncement drift and firm size documented by Foster et 
al. (1984). However, they measure postannouncement drift in calendar time, not num- 
ber of trades. Large firms also have considerably more volume than small firms. It is 
possible that large firms take more trades to converge to the efficient price than small 
firms, even though the trades are consummated in a shorter amount of time. 

The method of using an artificial economy to identify the efficient price and alloca- 
tion is very powerful. No matter how bizarre the traders’ preferences, information pro- 
cessing, or decision making under uncertainty, they also have these attributes in the 
artificial economy. Consequently, many factors that would be difficult to include in a 
formal asset-pricing model can be ignored. The ability to measure efficiency precisely,
along with the ability to manipulate features of the economy, gives laboratory markets a
power not available to secondary data studies.
SYNOPSIS: The accounting profession is witnessing an increase in both
the number of lawsuits against auditors and the settlements associated 
with those suits. As an example, partners with Laventhol & Horwath cited 
litigation claims against their firm as a major factor in the nation’s seventh 
largest accounting firm’s decision to file for bankruptcy protection. 
Disclosures by Big Eight (now Six) firms show that between 1980 and 1984 
nearly 180 million dollars were paid to settle audit-related litigation (Public 
Accounting Report 1985). An additional cost to firms associated with this 
litigation is reflected in the rise of malpractice insurance rates. For example, 
during 1984 the AICPA’s professional liability insurance plan doubled its 
insurance premiums while at the same time increasing deductibles and 
decreasing coverage (Collins 1985). Auditing firms also suffer indirect costs 
as a result of increasing litigation. Prior research (St. Pierre and Anderson 
1984; Palmrose 1988) examined audit litigation cases and provided descrip- 
tions of characteristics of auditors in those cases. Palmrose (1988) suggests 
that an increasing frequency of litigation against an auditing firm is viewed 
as a negative signal about the quality of auditing services provided by the 
firm, thereby impairing its reputation. 

Two conditions are likely to exist in order for a lawsuit to be filed 
against an auditor: (1) an allegation of audit failure, and (2) legal action 
provides a cost-effective alternative for potential plaintiffs. This study
hypothesizes that the client’s financial condition, asset structure, and sales 
growth affect the likelihood of erroneous financial statements being issued 
and that the auditor’s ability to detect and willingness to disclose errors are 
related to the probability of an audit failure. This study also suggests that 
the greater the market value of the client and the higher the variability of 
the client’s returns, the more likely the auditor of that client will be a target 
of litigation. 

A matched-pairs design is used to analyze a sample of companies 
involved in lawsuits against auditors and a sample of companies matched 
with the experimental sample on industry and time period. The results 
provide evidence of an association between pre-audit engagement¹ characteristics
of both the client and the auditor, and the subsequent filing of a
lawsuit against the auditor. After controlling for industry effects, the ratios 
of accounts receivable and inventory to total assets, the client's variance of 
abnormal returns, financial condition, and market value are found to be sig- 
nificantly associated with lawsuits against auditors. 

A test of the model’s predictive ability using various relative error costs
and assuming various prior probabilities of auditor litigation results in the
conclusion that the model’s ability to outperform a naive strategy is sensitive to
the parameters selected. However, when realistic priors and error costs are
assumed, the model is effective in identifying high-risk audit engagements. 
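
One common way to frame such a comparison is through expected misclassification cost, sketched below. This is an assumption about the general form of the calculation, not the paper's own procedure: the formula, the function names, the error rates, the prior, and the cost ratios are all hypothetical and chosen only to show how the ranking of a model against a naive rule can flip with the parameters.

```python
def expected_cost(prior, miss_rate, false_alarm_rate, cost_miss, cost_false_alarm):
    """Expected cost per engagement of a classifier with the given error rates."""
    return (prior * miss_rate * cost_miss
            + (1 - prior) * false_alarm_rate * cost_false_alarm)


def naive_cost(prior, cost_miss, cost_false_alarm):
    """Cheaper of the two naive rules: flag no engagement, or flag every one."""
    return min(prior * cost_miss, (1 - prior) * cost_false_alarm)


# Hypothetical inputs: 3 percent prior, 30 percent misses, 20 percent false alarms.
prior, miss, false_alarm = 0.03, 0.30, 0.20
for cost_ratio in (1, 20):
    model = expected_cost(prior, miss, false_alarm, cost_ratio, 1)
    naive = naive_cost(prior, cost_ratio, 1)
    print(cost_ratio, round(model, 3), round(naive, 3), model < naive)
```

With equal error costs and a low prior, the naive rule of flagging nothing is hard to beat; once a missed litigation case is assumed to cost far more than a false alarm, the same hypothetical model comes out ahead.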


Key Words: Auditing, Errors, Litigation, Prediction. 


Data Availability: The data upon which this paper is based may be 
obtained from the author on request. 


The remainder of this article proceeds as follows: Section I discusses the auditing
environment and how elements of that environment affect the auditor’s exposure
to litigation risk. Section II states the hypotheses to be tested and identifies the
empirical proxies used to operationalize those hypotheses. Section III describes the 
sample selection and provides descriptive statistics for the various samples. Section IV 
contains the results of the regression analysis using various statistical methods. Section 
V presents results relating to the model’s predictive ability, and section VI presents 
brief concluding remarks. 


I. The Audit Litigation Process 


1 Data available after the audit engagement is accepted but prior to the lawsuit filing could be used to increase 
the explanatory power of the model. For example, Kellogg (1984) demonstrates that negative abnormal 
returns accrue to shareholders prior to the discovery of an alleged error. However, this information may not be 
available to the auditor either when the decision to accept the firm as a client is made or, if the client is accepted, 
when the audit pricing decision is made. The analysis in this paper uses only information that is 
available to the auditor when the decision to audit for a given year is made. 

Figure 1 
Sequence Involved in Auditor Litigation Cases 

[Flowchart not reproduced. Recoverable labels: Financial Statements Issued; Client/User Losses (i.e., Bankruptcy, Firm Announcement, Stock Price Decline, etc.); No Allegations.] 

Auditing standards require the auditor to “design the audit to provide reasonable 
assurance of detecting errors and irregularities that are material to the financial 
statements” (SAS 53). Investors and creditors who rely on audited financial statements and 
subsequently incur losses may allege that their reliance on the audit firm’s opinion 
contributed to the losses. Brumfield et al. (1983) conclude that this risk of litigation is a 
major factor in planning an audit. Also, Simunic (1980) formally recognizes the effect 
that litigation risk has on the audit fee. When determining the appropriate audit fee, the 
auditor must assess the “present value of possible future losses which may arise from 
this period’s audited financial statements” (1980, 164). Simunic demonstrates that 
“possible future losses” are affected by the amount of auditing performed and that the 
development of an effective audit plan is contingent upon understanding those factors 
which affect “possible future losses.” 

Figure 1 illustrates a typical sequence involved in auditor litigation cases, which is 
similar to that in Palmrose (1988). First, the auditor performs an audit and issues an 
opinion on the firm’s financial statements. The audited financial statements along with 
their accompanying opinion provide users with an imperfect signal concerning the 
firm’s compliance with generally accepted accounting principles (GAAP). If subse- 
quent circumstances, such as bankruptcy or firm announcements, call into question the 
quality of the audit, financial statement users may allege an audit failure. The allegation 
may result from evidence (i.e., erroneous financial statements) indicating a substandard 
audit, in which case filing a lawsuit against the auditor is probable, or, because of 
the low cost associated with naming defendants in legal actions, an auditor may be 
sued even when an audit failure is not alleged. 

Several aspects of figure 1 are worth noting. First, the allegation of an audit failure 
does not always follow the incurrence of losses by financial statement users. For exam- 
ple, losses arising from general market or industry conditions are typically not attrib- 
uted to the quality of the audit. Although audit failures are often associated with the 
issuance of erroneous financial statements, the mere existence of an error in the finan- 
cial statements is not a sufficient condition for an audit failure. The possibility exists 
that auditors may correctly apply generally accepted auditing standards (GAAS) but 
erroneous financial statements will still be issued. Arens and Loebbecke (1988, 100) 
discuss this possibility: 


Auditing cannot be expected to uncover all material financial statement misstatements. 
Auditing is limited to sampling, and certain well-concealed frauds are extremely difficult 
to detect; therefore, there is some risk that the audit will not uncover a material financial 
statement misstatement. 


In assessing the likelihood of an audit failure, the actions of the auditor in relation 
to the events in question are reviewed and an assessment of the appropriateness of the 
auditor’s decisions is made by users. For example, if the event of interest is a business 
failure, financial statement users will analyze whether or not the auditor had sufficient 
information at the time of the audit to conclude that the client had “going concern” 
problems. Because of the complexities of auditing, plaintiffs typically lack the neces- 
sary information or the required expertise to assess the appropriateness of the auditor’s 
actions. However, with the availability of class action privileges and contingent legal 
fees,² it is often cost-effective for the plaintiff to ask the courts to evaluate the quality of 
the audit and to determine if an audit failure has occurred. 

Second, note that the filing of a lawsuit does not directly follow from the allegation 
of an audit failure. As indicated, if an audit failure is alleged, a lawsuit is the most prob- 
able course of action. Nonetheless, in some instances where evidence indicates an 
audit failure has occurred, a lawsuit may not be filed. Examples of this include a lack of 
reliance on the published financial statements or the plaintiff’s being guilty of contributory 
negligence.³ Conversely, there will be some cases where a lawsuit will be filed even 
though no audit failure is alleged. An auditor’s “deep pocket” is a typical example of 
this situation. 

The analysis of figure 1 aids in identifying factors associated with “possible future 
losses.” The probability of future losses being incurred is a function of the likelihood of 
erroneous financial statements being issued and an audit failure being alleged. The 
amount of future losses is related to the benefits perceived to be obtained by potential 
plaintiffs in pursuing litigation. 


2 DeJong (1985) notes that lawsuits against auditing firms under the federal securities acts increased from 
12 suits in the ten-year period before class action privileges were extended to investors (on 1 July 1966) to over 
200 in the subsequent ten-year period. Of those 200 suits, two-thirds were class action. He also notes that con- 
tingent fees are the most common form of attorney compensation under class action privileges. 

3 Although it is difficult to cite instances of lawsuits not being filed, cases of lawsuits being dismissed are 
available. For example, Vanderbilt Growth Fund, Inc. v. Superior Court (164 Cal. Rptr. 621) provides an exam- 
ple of a lawsuit being dismissed because of a lack of reliance on the part of the plaintiff. 


The above discussion of figure 1 leads to two questions: 


1. What client or auditor characteristics are related to the occurrence of alleged 
audit failures? 

2. What client or auditor characteristics influence a financial statement user’s as- 
sessment of the benefits of legal action? 


In the discussion that follows, each of the above questions is addressed and testable hy- 
potheses are specified. 


II. Hypotheses Development 
Alleged Audit Failures 


In this section, the characteristics perceived as being related to the occurrence of 
errors in financial statements are identified. Given the descriptive nature of this study, 
these characteristics are, in turn, hypothesized to influence the probability of an audit 
failure being alleged and a lawsuit being filed. 


Client Characteristics 


SAS 47 notes that those accounts that require subjective judgment in determining 
their value generally present an increased risk of error. In addition, those accounts that 
represent a large percentage of total assets generally present a greater risk of potential 
litigation because a small percentage error in an account with a relatively large balance 
can result in a material misstatement. 

Accounts receivable and inventory are two accounts that consistently meet both of 
the above criteria.* One or both typically comprise a major portion of total assets for 
most firms’ and both require the auditor to estimate their future value. Evidence of the 
riskiness of these two accounts is also provided in Simunic (1980, 173), who states that 
“liability exposure is thus expected to vary cross-sectionally with the relative size of re- 
ceivables and inventories in different auditee balance sheets.” Therefore, the proba- 
bility of an error occurring in the financial statements is hypothesized to be related to 
the size of these two account balances, relative to total assets. As this probability in- 
creases, the risk of litigation increases as well. Thus, the first two hypotheses (stated in 
their alternative form) are: 


H1: The ratio of accounts receivable to total assets is likely to be larger for a client 
whose audit engagement results in a lawsuit against the auditor than for a 
client whose audit engagement does not result in a lawsuit against the auditor. 

H2: The ratio of inventory to total assets is likely to be larger for a client whose 
audit engagement results in a lawsuit against the auditor than for a client 
whose audit engagement does not result in a lawsuit against the auditor. 


The variables are operationalized (variable names A/R and INV) using data obtained 
from the balance sheet for the period prior to the one in which the error is alleged to 
have occurred. 


4 Numerous empirical studies provide support for the riskiness associated with accounts receivable and 
inventory. See, for example, Ham et al. (1985), Kreutzfeldt and Wallace (1986), Willingham and Wright (1985), 
and St. Pierre and Anderson (1984). 

5 For the firms included in a control sample in this study, Accounts Receivable and Inventory comprise an 
average of 40 percent of total assets. 

SAS 47 also recognizes that the effectiveness of the client’s internal control system 
affects the likelihood of material errors occurring in the financial statements. When 
affects the likelihood of material errors occurring in the financial statements. When 
evaluating a firm’s internal control system, auditors often analyze the transaction 
cycles of a business (Arens and Loebbecke 1988). The two most common cycles are the 
revenue/receipt cycle and the expenditure/disbursement cycle. Hylas and Ashton 
(1982) found that these two cycles accounted for 44.7 percent of detected errors. A sig- 
nificant change in either cycle could impact the control system’s ability to properly pro- 
cess transactions. Hall and Renner (1988) note that a fast-growing company whose con- 
trol system is not designed to handle the growth often results in the audit firm being 
involved in litigation. Growth is operationalized in this study by change in sales, which 
leads to the following hypothesis: 


H3: The percentage change in sales is likely to be larger for a client whose audit 
engagement results in a lawsuit against the auditor than for a client whose audit 
engagement does not result in a lawsuit against the auditor. 


Sales growth (variable name GROWTH) is measured as the percentage change in sales 
from period (t—1) to period t, where period t is the fiscal year preceding the year in 
which the alleged error occurred. However, if the firm or the auditor recognizes that an 
increase in the level of sales could alter the effectiveness of the internal control and 
takes appropriate measures to compensate for this factor, the likelihood of obtaining 
significant results for this factor will be reduced. 

Empirical evidence indicates that a firm’s financial condition is often an indicator 
of erroneous financial statements. Kinney and McDaniel (1989, 74) note that “managements 
of firms in weak financial condition are more likely to window dress in an attempt 
to disguise what may be temporary difficulties.” Also, Kreutzfeldt and Wallace 
(1986) find that companies with liquidity or profitability problems have significantly 
more errors in their financial statements than do other companies. 

In addition to being associated with financial statement errors, poor financial con- 
ditions may also be related to the filing of a lawsuit by providing plaintiffs with an 
incentive, i.e., the incurrence of losses (see figure 1), to attempt to recover from whom- 
ever has the “deepest pocket,” such as auditors. As a result, a measure of a firm’s finan- 
cial condition is hypothesized to be related to the litigation risk faced by the auditor. 
This leads to the following hypothesis: 


H4: Financial condition is likely to be poorer for a client whose audit engagement 
results in a lawsuit against the auditor than for a client whose audit engage- 
ment does not result in a lawsuit against the auditor. 


Altman Z-scores (Altman and McGough 1974) are used to measure the clients’ finan- 
cial conditions (variable name FC), with the coefficients from Altman’s model used to 
compute a score for each client for the year prior to the year of the alleged error occur- 
rence. More recent financial distress prediction models are available (e.g., Ohlson 
1980). However, Hamer (1983) demonstrates that the various models available in the 
literature do not statistically differ in their ability to predict business failure. 
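
As a rough illustration of how the FC variable could be computed, the sketch below applies the five weights the paper lists for the Z-score (Altman and McGough 1974, 52). The function name, the dictionary keys, and the example client are hypothetical. The code also assumes that every ratio is entered in percentage points, which the text does not state; that choice is made only because it yields scores of the same order as the sample means reported for financial condition.

```python
# Weights as printed in the paper's note on the Z-score (Altman and McGough 1974, 52).
Z_WEIGHTS = {
    "working_capital_to_total_assets": 0.012,
    "retained_earnings_to_total_assets": 0.014,
    "ebit_to_total_assets": 0.033,
    "market_equity_to_book_debt": 0.006,
    "sales_to_total_assets": 0.010,
}


def z_score(ratios_pct):
    """Return the weighted sum of the five ratios.

    ASSUMPTION: each ratio is expressed in percentage points (20 means 20%).
    The paper does not state the units.
    """
    missing = set(Z_WEIGHTS) - set(ratios_pct)
    if missing:
        raise ValueError(f"missing ratios: {sorted(missing)}")
    return sum(Z_WEIGHTS[name] * ratios_pct[name] for name in Z_WEIGHTS)


# Hypothetical client, for illustration only.
example_client = {
    "working_capital_to_total_assets": 20.0,
    "retained_earnings_to_total_assets": 30.0,
    "ebit_to_total_assets": 10.0,
    "market_equity_to_book_debt": 150.0,
    "sales_to_total_assets": 120.0,
}
example_z = z_score(example_client)
```

A lower score indicates poorer financial condition, which is the direction H4 associates with lawsuits against the auditor.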


6 The elements of the Z-score with their associated weightings (in parentheses) are as follows (Altman and 
McGough 1974, 52): working capital/total assets (.012), retained earnings/total assets (.014), earnings before 
interest and taxes/total assets (.033), market value of equity/book value of total debt (.006), and sales/total 
assets (.010). 

Auditor Characteristics 


The auditor’s ability to appropriately assess the risk characteristics of clients and to 
perform adequate audit procedures will affect the probability of an error being detected. 
DeAngelo (1981a; 1981b) identifies this ability as auditor quality and asserts that 
audit firms will provide varying levels of quality. If more resources are required in 
order to provide a higher quality audit, then larger audit firms will provide a higher 
level of quality. The following is hypothesized concerning quality: 


H5: The resources available to an auditor are likely to be less for those audit engagements 
that result in a lawsuit against the auditor than for those audit engagements 
that do not result in a lawsuit against the auditor. 


Simunic and Stein (1986) argue that clients infer audit quality from name-brand rep- 
utation. Concerning litigation risk and audit quality, Palmrose (1988, 72) finds that 
“non-Big Eight firms as a group had higher litigation occurrence rates than the Big 
Eight.” This partition may be indicative of the resources available to firms of various 
sizes. To replicate Palmrose’s (1988) work in a multivariate setting, a “1” is assigned to 
Big Eight firms and a “0” to non-Big Eight firms to proxy for the name brand of the 
auditor (variable name NAME). 

An additional factor contributing to the occurrence of an audit failure relates to the 
tenure of the auditor/client relationship. Significant start-up costs are incurred by 
auditors in initial audits as they familiarize themselves with the client’s operations. 
Under those conditions, there is an increased risk of not detecting errors as a result of 
unfamiliarity. St. Pierre and Anderson (1984, 247) note that “learning occurs as experi- 
ence with a client increases, thereby resulting in greater efficiency in the collection and 
evaluation of evidence.” Their results support the hypothesis that the risk of litigation 
to the auditor is greater in the first years of the client/auditor relationship. The follow- 
ing hypothesis relates to auditor tenure: 


H6: The tenure of the auditor/client relationship is likely to be shorter for those 
audit engagements that result in a lawsuit against the auditor than for those 
audit engagements that do not result in a lawsuit against the auditor. 


To operationalize auditor tenure (variable name TENURE), a dummy variable is 
created with a value of “0” if audit tenure is three years or less and “1” otherwise. 

The final factor related to audit failure is the auditor’s proclivity to disclose a discovered 
error. Watts and Zimmerman (1986) designate this factor as auditor independence. 
If the client is able to pressure the auditor into not disclosing an error, the auditor’s 
exposure to the risk of litigation is increased.⁷ Because auditor switching is costly 
to the auditor,⁸ the client can influence the auditor’s decision to disclose errors, especially 
in circumstances in which there is a lack of independence. The following is 
hypothesized regarding independence: 


H7: The independence of the auditor is likely to be less for those audit engagements 
that result in a lawsuit against the auditor than for those audit engagements 
that do not result in a lawsuit against the auditor. 


7 The ESM Government Securities case (Sack and Tangreti 1987) illustrates how pressure from an audit 
client can result in an auditing firm compromising its independence. 

8 If the client switches auditors, the incumbent auditor loses the stream of quasi-rents that arise from the 
practice of low-balling (DeAngelo 1981a). 

Because data relating to audit fees are not directly available, the relative size of the 
client’s sales to the total sales of all clients specific to the auditor will be used to operationalize 
independence⁹ (variable name INDEPNT). These data are obtained from Who 
Audits America. Thus, the auditor’s ability to withstand client pressure is measured as: 
[1—client’s sales/total sales of all clients]. 
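
The independence proxy is simple enough to state directly in code. The following is a minimal sketch under the definition given above; the function name and the example figures are hypothetical, and the input checks are an added convenience rather than part of the paper's method.

```python
def independence(client_sales, total_sales_all_clients):
    """INDEPNT = 1 - client sales / total sales of all clients of the auditor."""
    if total_sales_all_clients <= 0:
        raise ValueError("total sales of all clients must be positive")
    if not 0 <= client_sales <= total_sales_all_clients:
        raise ValueError("client sales must lie between 0 and the auditor total")
    return 1.0 - client_sales / total_sales_all_clients


# Hypothetical figures: a client accounting for 0.5 percent of the sales
# of all clients of its auditor.
example_independence = independence(5.0, 1000.0)
```

Values close to 1 indicate a client that is small relative to the auditor's portfolio, which the paper interprets as greater ability to withstand client pressure.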


Expected Benefits of Legal Action 


Since common and securities laws both require plaintiffs to incur losses in order to 
file a successful lawsuit,¹⁰ those firms whose returns vary a great deal will provide more 
opportunity for individuals to allege losses. The higher the variability in a firm’s returns, 
the higher the probability of large decreases (and increases¹¹) in stock price, and 
the greater the perceived benefit of legal action against the auditor. The following 
relationship is hypothesized: 


H8: The variability of abnormal returns is likely to be higher for a client whose 
audit engagement results in a lawsuit against the auditor than for a client 
whose audit engagement does not result in a lawsuit against the auditor. 


Variability of a firm’s returns (variable name VAR) is operationalized using the vari- 
ance of residuals obtained from regressing daily firm security returns against a market 
index for the six-month period preceding the period of the alleged error. 
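
One way to read this operationalization is as the residual variance from a single-index market model. The sketch below fits that regression by ordinary least squares in plain Python. The function name and the sample return series are hypothetical, and dividing the sum of squared residuals by n - 2 is an assumption, since the paper does not say which divisor was used.

```python
def market_model_residual_variance(firm_returns, market_returns):
    """Variance of residuals from regressing firm returns on a market index.

    ASSUMPTION: the divisor is n - 2 (two estimated parameters).
    """
    n = len(firm_returns)
    if n != len(market_returns) or n < 3:
        raise ValueError("need two equally long series with at least 3 observations")
    mean_f = sum(firm_returns) / n
    mean_m = sum(market_returns) / n
    sxx = sum((m - mean_m) ** 2 for m in market_returns)
    if sxx == 0:
        raise ValueError("market returns have no variation")
    sxy = sum((m - mean_m) * (f - mean_f)
              for f, m in zip(firm_returns, market_returns))
    beta = sxy / sxx                  # slope on the market index
    alpha = mean_f - beta * mean_m    # intercept
    residuals = [f - (alpha + beta * m)
                 for f, m in zip(firm_returns, market_returns)]
    return sum(e * e for e in residuals) / (n - 2)


# Hypothetical daily returns, for illustration only.
market = [0.010, -0.020, 0.015, 0.000, -0.005]
noise = [0.002, -0.002, 0.002, -0.002, 0.000]
firm = [0.001 + 1.2 * m + e for m, e in zip(market, noise)]
example_var = market_model_residual_variance(firm, market)
```

In the paper the regression uses daily returns over the six-month period preceding the period of the alleged error; the five observations here only keep the example short.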

Given that an audit failure has been alleged and damages incurred, the decision to 
file a lawsuit is then a function of the size of the award that the plaintiffs expect to re- 
ceive. The size of the award is an increasing function of the losses incurred by investors 
and since costs associated with litigation increase at a decreasing rate (Dewees et al. 
1981), the plaintiff will be most likely to pursue cases of alleged audit failure in those in- 
stances where his loss is largest. Kellogg (1984) provides evidence of a relationship be- 
tween the amount of damages incurred by plaintiffs and firm size. His data indicate a 
correlation of 0.896 (p<0.01) between client market value prior to the discovery of an 
error and the amount of damages incurred following the discovery of an alleged error. 
As a result, the following is hypothesized: 


H9: The market value of a company is likely to be higher for a client whose audit 
engagement results in a lawsuit against the auditor than for a client whose 
audit engagement does not result in a lawsuit against the auditor. 


In operationalizing the size variable (variable name MV), the market value of the 
company as measured at the end of the period preceding the period of the alleged error 
is used. 


III. Sample Selection and Profile Statistics 


For the period 1960 through 1985, The Wall Street Journal Index and the Securities 
and Exchange Commission Accounting Series Releases (ASR) were reviewed to identify 
cases in which auditors¹² were involved in litigation. Because of the noncomparability 
of many financial statement items, cases involving financial and service firms 
were deleted from the sample.¹³ Firms were also eliminated from the samples if information 
about the independent variables was not available for the one-year period preceding 
the occurrence of the error. Forty-nine firms met these data requirements. The 
sample selection procedure is summarized in table 1. Table 2 provides information regarding 
the sample’s distribution across time period, industry membership, and accounting 
firm classification. Relevant percentages from Palmrose (1988) are included 
to allow comparison of sample distributions across studies. 


Table 1 
Summary of the Sample Selection Criteria 

Identified lawsuits                                            164 
Less firms with return data not available                       78 
Firms with returns data available                               86 
Less firms without financial statement information available     8 
Firms meeting data availability requirements                    78 
Less financial institutions and service firms                   29 
Total firms included in the sample                              49 


9 This surrogate assumes a positive functional relationship between client sales and the audit fee. 

10 Kellogg (1984) finds six instances in which lawsuits were filed even though damages were not incurred. 
He concludes, “It is difficult to understand plaintiffs’ incentives to invest in these lawsuits. In one of these 
cases the court decided in favor of plaintiffs and awarded zero damages.” 

11 Stockholders of firms with high return variances will experience just as many abnormally large gains as 
they will losses. However, the legal system does not encourage the investigation of unexpected gains. 

Because general economic conditions (Palmrose 1987) and industry membership 
(St. Pierre and Anderson 1984) are expected to influence the probability of a firm’s 
auditor being involved in litigation, economic conditions and industry effects are con- 
trolled for by selecting (subject to data availability) two control samples. The first con- 
trol sample is matched on time period, while the second control sample (selected inde- 
pendent of the first) is matched on both time period and the three-digit SIC code. 
Control firms were randomly selected from a pool of firms consisting of all firms on the 
COMPUSTAT annual tape and annual research tape that met the matching requirement 
and are not known to be involved in auditor litigation.¹⁵ 


12 Using the same classification criteria as Palmrose (1988, 61), only the 15 largest accounting firms are 
examined in this research. Lawsuits against auditors were identified in The Wall Street Journal Index by examining 
information relating to specific auditors as well as the topics of litigation, bankruptcy, and accountants. 

13 For example, for those firms whose auditors were identified as being involved in litigation, financial institutions 
have a significantly higher ratio of receivables, i.e., loans, to total assets and a significantly lower 
ratio of inventory to total assets than do manufacturing firms both evaluated at the five percent level of signifi- 
cance. In addition, financial institutions report a significantly lower financial condition when measured by the 
Z-score. This difference is not attributed to banks being in worse financial condition but rather to the fact that 
the Z-score was developed using manufacturing firms while specifically excluding banks. 

14 The year of the error occurrence was identified through information contained in The Wall Street Journal 
and is not the same as the year in which the error is reported. If errors are alleged over multiple years, the 
earliest year is identified as the year of interest. 

15 To determine if the auditors of firms included in the control sample were not involved in litigation, The 
Wall Street Journal Index was reviewed for the ten-year period surrounding the year of interest for instances of 
litigation involving the auditor or the firm. This procedure cannot ensure that firms in the control sample were 
not involved in litigation. It is possible that a firm whose auditor was involved in litigation may be mistakenly 
included in the control sample. Including a litigation firm in the control sample would lead to a reduced likelihood 
of detecting significant differences between the two samples. 


Table 2 
Sample of Auditor Litigation Cases Classified by Industry, 
Time Period, and Auditor Classification 
Percentages from Palmrose (1988) included for comparison 

                                               Current    Palmrose 
Industry                                        Study      (1988) 
Agriculture                                       0.0%       3.5% 
Mining and Construction                          26.5       14.4 
Manufacturing                                    47.0       57.8 
Transportation, Communications, Electric, 
  Gas, and Sanitary Services                     12.2       11.7 
Wholesale and Retail Trade                       14.3       12.6 
Total                                           100.0%     100.0% 

                         Time Period 
Auditor                                       Current    Palmrose 
Classification      1960-1972    1973-1985     Study      (1988) 
Big Eight             40.8%        36.7%       77.5%      83.7% 
Non-Big Eight         14.3          8.2        22.5       18.3 
Current Study         55.1%        44.9% 
Palmrose (1988)       39.4%        60.8% 

Table 3, panels A and B, provide profile statistics for the litigation and control sam- 
ples along with t-statistics and significance levels. The predicted sign indicates how the 
operational measure is hypothesized to vary with lawsuits against auditors. The uni- 
variate analysis from both panels indicates significant differences (p<0.10) in the hy- 
pothesized direction between the litigation and control samples for percentage change 
in sales, client financial condition, auditor independence, and variance in abnormal 
returns. In addition, auditor tenure is statistically significant for the sample matched on 
time period (panel A), and the ratio of receivables to total assets is significant for the 
sample matched on time period and industry (panel B). These univariate results are 
consistent with prior research in auditor litigation (St. Pierre and Anderson 1984; 
Palmrose 1987). Table 4 presents the correlation matrix for the set of independent vari- 
ables. 

The effects of industry membership can be examined by comparing the means for 
the two control samples. As shown in table 5, the results indicate that industry member- 
ship may influence the occurrence of lawsuits against auditors. Significant differences 
(p<0.05) exist for three of the nine independent variables. Thus, controlling for indus- 
try in the statistical tests that follow is important for making appropriate inferences. 


Table 4 
Correlation Matrix Among Independent Variables 
Sample matched on time period and industry 

           H1     H2     H3      H4     H5     H6      H7     H8     H9 
           A/R    INV    GROWTH  FC     NAME   TENURE  INDEP  VAR    MV 
A/R        1.0    0.21   0.18    0.02  -0.11   0.02   -0.13   0.07   0.12 
INV               1.0   -0.18    0.27  -0.21   0.10   -0.11   0.05  -0.21 
GROWTH                   1.0     0.04   0.03  -0.12    0.11   0.25   0.03 
FC                               1.0    0.09   0.07   -0.15  -0.26   0.24 
NAME                                    1.0    0.04    0.41  -0.04   0.35 
TENURE                                         1.0    -0.15  -0.26   0.27 
INDEP                                                  1.0    0.18  -0.28 
VAR                                                           1.0   -0.45 
MV                                                                   1.0 

Two-tailed test, N=98 
Cut-off for five percent significance level: ±.19 


Table 5 
A Statistical Comparison of the Means of the Two Control Samples 
Means (Standard Deviations) 

                       Control Sample    Control Sample 
                       Matched on        Matched on Time       t-test* 
                       Time Period       Period and Industry   (p-value) 
Receivables/             0.224             0.188                 2.439 
  Total Assets          (0.146)           (0.080)               (0.014) 
Inventory/               0.250             0.239                 0.356 
  Total Assets          (0.135)           (0.169)               (0.723) 
% Change in              0.109             0.166                -1.535 
  Sales                 (0.153)           (0.210)               (0.125) 
Financial                5.215             4.160                 1.570 
  Condition             (4.141)           (2.230)               (0.116) 
Name Brand               0.816             0.837                -0.274 
  (0, 1)                (0.387)           (0.370)               (0.783) 
Tenure                   0.979             0.776                 3.228 
  (0, 1)                (0.141)           (0.417)               (0.001) 
Independence             0.998             0.995                 2.011 
                        (0.003)           (0.010)               (0.048) 
Variance of              0.00057           0.00078              -1.055 
  Abnormal Returns      (0.00118)         (0.00074)             (0.289) 
ln(Market Value)        18.234            18.058                 0.513 
                        (1.822)           (1.568)               (0.612) 

* Two-tailed tests. 


IV. Research Methods and Multivariate Results 


The nine hypotheses are tested by estimating the coefficients in the following cross- 
sectional model (the hypothesized sign is enclosed in parentheses): 


SUIT = b₀ + b₁A/R + b₂INV + b₃GROWTH + b₄FC + b₅NAME 
       + b₆TENURE + b₇INDEPNT + b₈VAR + b₉MV          (1) 

where: 
SUIT =lawsuit (=1), no lawsuit (=0), 

(+) A/R=ratio of accounts receivable to total assets, 

(+) INV=ratio of inventory to total assets, 

(+) GROWTH =change in sales for client, 

(—) FC =the Z-score of the client, 

(—) NAME =the quality classification of the auditor (Big Eight=1, non-Big 
Eight=0), 

(—) TENURE=number of years the auditor has worked for the client (three years or 
less=0, more than three years=1), 

(—) INDEPNT=1-client sales/total sales of all clients of given auditor, 

(+) VAR =the variance of abnormal returns for the client, and 

(+) MV =natural log¹⁶ of the market value of the firm. 


Probit regression is used to estimate the multivariate model in equation (1).¹⁷ 
The results using the control sample matched only on time period are presented in table 
6. The percentage change in sales, client financial condition, auditor tenure, auditor in- 
dependence, variance in abnormal returns, and client market value are significant at 
the five percent level and are in the predicted direction. 
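
To make the probit specification concrete, the sketch below turns a set of client characteristics into an estimated lawsuit probability by passing the linear index through the standard normal distribution function. The coefficient values are those reported in table 6 for the sample matched on time period; the function names and the example client are hypothetical, and the code is an illustration rather than the estimation procedure itself.

```python
import math

# Probit coefficients as reported in table 6 (sample matched on time period).
TABLE6_COEFFICIENTS = {
    "CONSTANT": 315.740,
    "A/R": -0.273,
    "INV": 0.423,
    "GROWTH": 1.053,
    "FC": -0.180,
    "NAME": 2.276,
    "TENURE": -1.517,
    "INDEPNT": -323.440,
    "VAR": 2725.800,
    "MV": 0.269,   # MV is the natural log of market value
}


def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lawsuit_probability(client, coefficients=TABLE6_COEFFICIENTS):
    """Probit probability: Phi(b0 + sum of b_k * x_k)."""
    index = coefficients["CONSTANT"] + sum(
        coefficients[name] * client[name]
        for name in coefficients
        if name != "CONSTANT"
    )
    return standard_normal_cdf(index)


# Hypothetical client, for illustration only.
example_client = {
    "A/R": 0.22, "INV": 0.25, "GROWTH": 0.15, "FC": 4.5, "NAME": 1,
    "TENURE": 1, "INDEPNT": 0.997, "VAR": 0.0007, "MV": 18.0,
}
example_probability = lawsuit_probability(example_client)
```

Because the coefficient on FC is negative, lowering the Z-score of the hypothetical client raises the estimated probability, which is the direction H4 predicts.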

To control for industry, the model was reestimated with the industry pair-matched 
control sample. In an attempt to more fully control for the effects of industry, a 
matched-pairs design is incorporated (see Bowen et al. 1981).¹⁸ The results presented in 
table 7 indicate that the ratio of inventory to total assets, client financial condition, and 
client market value are significant at the five percent level and are in the predicted 
direction. Also, the ratio of accounts receivable to total assets and variance in abnormal 
returns are significant at the ten percent level and are in the predicted direction. The 
F-statistic for the matched-pairs analysis as well as the adjusted R² of 0.30 indicate that 
the independent variables combine to explain a significant portion of the variance in 
the dependent variable. 
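
The matched-pairs construction described in the paper's note on this design can be sketched as follows: the 49 pairs are split at random into groups of 25 and 24, the first group is coded as control minus litigation with a dependent variable of 0, and the second as litigation minus control with a dependent variable of 1. The function name, the seed, and the list-of-lists data layout are assumptions made for illustration.

```python
import random


def matched_pair_differences(litigation_rows, control_rows,
                             n_first_group=25, seed=0):
    """Build difference vectors and 0/1 labels for a matched-pairs probit.

    Row i of each input holds the independent variables of one firm in
    pair i. The seed is arbitrary and only makes the split repeatable.
    """
    if len(litigation_rows) != len(control_rows):
        raise ValueError("each litigation firm needs exactly one control firm")
    pair_ids = list(range(len(litigation_rows)))
    random.Random(seed).shuffle(pair_ids)
    first_group = set(pair_ids[:n_first_group])

    differences, labels = [], []
    for i, (lit, ctl) in enumerate(zip(litigation_rows, control_rows)):
        if i in first_group:
            # First group: control minus litigation, dependent variable 0.
            differences.append([c - l for l, c in zip(lit, ctl)])
            labels.append(0)
        else:
            # Second group: litigation minus control, dependent variable 1.
            differences.append([l - c for l, c in zip(lit, ctl)])
            labels.append(1)
    return differences, labels
```

The probit regression is then run on the difference vectors, so any characteristic shared by both firms in a pair, such as industry, drops out of the comparison.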

The results in tables 6 and 7 are consistent in showing the client’s financial condi- 
tions, variance of abnormal returns, and market value are significant. However, the re- 
sults in tables 6 and 7 also reflect the industry effects identified in table 5. Thus, it 
would appear that industry membership influences the significance of factors associ- 
ated with lawsuits against auditors. For this reason, the remaining analysis will be con- 
ducted using only the control sample matched on time period and industry. 
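
As a consistency check on the fit statistics in tables 6 and 7, McFadden's R² can be recovered from the likelihood-ratio chi-square statistic. The sketch below assumes the null model is an intercept-only probit, so its log likelihood depends only on the number of ones and zeros in the dependent variable; that assumption, and the function names, are not taken from the paper.

```python
import math


def null_log_likelihood(n_ones, n_zeros):
    """Log likelihood of an intercept-only binary model (assumed null model)."""
    n = n_ones + n_zeros
    return n_ones * math.log(n_ones / n) + n_zeros * math.log(n_zeros / n)


def mcfadden_r2_from_lr(chi_square, n_ones, n_zeros):
    """McFadden's R^2 = 1 - LL_model / LL_null, with chi2 = 2 (LL_model - LL_null)."""
    ll_null = null_log_likelihood(n_ones, n_zeros)
    ll_model = ll_null + chi_square / 2.0
    return 1.0 - ll_model / ll_null


# Table 6: 49 lawsuit firms and 49 control firms, chi-square 66.890.
r2_time_matched = mcfadden_r2_from_lr(66.890, 49, 49)
# Table 7: 49 pairs, 24 coded 1 and 25 coded 0, chi-square 29.173.
r2_matched_pairs = mcfadden_r2_from_lr(29.173, 24, 25)
```

Under these assumptions both values land close to the McFadden R² figures reported in the two tables, which suggests the reported chi-square statistics and R² values are mutually consistent.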


16 The measure of firm market value is positively skewed. A natural log transformation results in an in- 
dependent variable with a more symmetrical distribution. 

17 Noreen (1988) demonstrates that in studies involving sample sizes similar to those in this research, 
ordinary least squares (OLS) regression performs at least as well as probit. OLS results (not reported) are quali- 
tatively similar to those obtained using probit. 

18 The matched-pairs design involves randomly assigning the 49 pairs of firms to one of two groups. For 
those assigned to the first group (n=25), the independent variables for the litigation firm are subtracted from 
the independent variables for the control firm, and the dependent variable is assigned a value of zero. For those 
assigned to the second group (n=24), the process is reversed and the dependent variable is assigned the value 
of 1. The probit regression is then run using the difference measures. 


Table 6 
Probit Results: Sample Matched on Time Period 
Coefficients, P-Values, F-Statistic, and R² 
N=98 

                           Predicted    Coefficients 
                           Sign         (p-values in parentheses)* 
AR/TA                        +            -0.273 
                                          (0.568) 
INV/TA                       +             0.423 
                                          (0.372) 
Growth                       +             1.053 
                                          (0.047) 
Financial Condition          -            -0.180 
                                          (0.048) 
Name Brand                   -             2.276 
                                          (0.997) 
Tenure                       -            -1.517 
                                          (0.018) 
Independence                 -          -323.440 
                                          (0.001) 
Variance of                  +         2,725.800 
  Abnormal Returns                        (0.033) 
ln(Market Value)             +             0.269 
                                          (0.029) 
Constant                                 315.740 
F-statistic                               66.890¹ 
                                          (0.000) 
R-squared (McFadden’s)                     0.493 
Adjusted R-squared                         0.441 

* All p-values are one-tailed. 
1 Chi-square statistic on log likelihood ratio. 


V. Predictive Results 


In the analysis that follows, the variables tested in the previous section of the paper 
are combined in a predictive model. A methodology similar to that used by Dopuch et 
al. (1987) is employed. The model developed in this research provides the auditor with a 
score for each client or prospective client, which in this case measures the potential for 
future litigation against the auditor. The higher the score, the more likely the auditor 
will be involved in future litigation associated with the specific client. A lawsuit score 
is computed for each of the 98 firms in the sample (49 lawsuit firms and 49 control 
firms) by using the jackknife method (Altman et al. 1981). Ninety-seven firms are used 
to compute the probit coefficients, and those coefficients are used to compute the litigation
score for the ninety-eighth firm. The litigation score is then compared to that cutoff
score that minimizes the expected cost of misclassification.

530  The Accounting Review, July 1991

Table 7
Probit Results: Matched Pairs Design Sample Matched on Time Period and Industry
Coefficients, P-Values, F-Statistic, and R²

N = 49 pairs

                                Predicted
Variable                          Sign      Coefficient    (p-value)*
ARITA                               +            4.687       (0.054)
INVITA                              +            4.160       (0.038)
Growth                              +            0.853       (0.149)
Financial Condition                 −           −0.346       (0.018)
Name Brand                          −           −0.181       (0.384)
Tenure                              −           −0.499       (0.133)
Independence                        −           −0.222       (0.498)
Variance of Abnormal Returns        +          487.530       (0.052)
ln(Market Value)                    +            0.439       (0.032)
F-statistic                                     29.173¹      (0.000)
R-squared (McFadden’s)                           0.430
Adjusted R-squared                               0.300

* All p-values are one-tailed.
¹ Chi-square statistic on log likelihood ratio.

The expected cost of misclassification is computed as follows:

$$EC = P(NL \mid L)\,P(L)\,C_1 + P(L \mid NL)\,P(NL)\,C_2, \tag{2}$$


where: 


EC = expected cost of misclassification,
P(NL|L) = probability of a type I error,19
P(L) = prior probability of a lawsuit occurring,
C_1 = cost of a type I error,
P(L|NL) = probability of a type II error,20
P(NL) = prior probability of a lawsuit not occurring, and
C_2 = cost of a type II error.


19 Type I error is defined as classifying an observation as nonlitigation when it is a litigation firm.
20 Type II error is defined as classifying an observation as litigation when it is a nonlitigation firm.

The term P(NL|L) · P(L) · C_1 represents the expected costs, both direct and indirect,
associated with agreeing to audit a firm that may eventually result in a lawsuit against
the auditor. Similarly, P(L|NL) · P(NL) · C_2 represents the expected opportunity costs
associated with refusing to audit a potential client that represents little actual danger of
future litigation against the auditor.21
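
The two terms of the expected cost of misclassification can be evaluated directly. The sketch below is only an illustration: the function name `expected_cost` and its argument names are ours, and the inputs are the error rates, prior, and 20:1 cost ratio used in the paper's own worked example in the footnotes (with the cost of a type II error normalized to 1).

```python
def expected_cost(type1_rate, type2_rate, prior, cost1, cost2=1.0):
    """Expected cost of misclassification, following equation (2).

    type1_rate: P(NL|L), share of litigation firms classified as nonlitigation
    type2_rate: P(L|NL), share of nonlitigation firms classified as litigation
    prior:      P(L), prior probability of a lawsuit
    cost1, cost2: costs of type I and type II errors (only the ratio matters
                  when the result is compared with the naive strategy)
    """
    return type1_rate * prior * cost1 + type2_rate * (1.0 - prior) * cost2


# Inputs from the worked example: 20:1 relative costs, four percent prior.
model = expected_cost(0.3469, 0.4082, prior=0.04, cost1=20)

# Naive strategy: every firm is classified as nonlitigation, so all
# litigation firms are type I errors and there are no type II errors.
naive = expected_cost(1.0, 0.0, prior=0.04, cost1=20)

relative_cost = model / naive
print(round(model, 4), round(naive, 2), round(relative_cost, 3))
```

A relative cost below 1 means the model's expected misclassification cost is lower than that of the naive strategy for the assumed prior and cost ratio.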

While the prior probability of a lawsuit against an auditor is not well established, 
Palmrose (1988) provides a range over which prior probabilities can be expected to 
vary. She classified, by auditing firm, 472 instances of audit-related lawsuits against 
auditors. The data indicate that Big Eight auditors were sued in roughly three percent of 
their engagements and non-Big Eight auditors were sued in approximately five percent of 
their audit engagements.22 In the tests that follow, the predictive ability of the model is
tested using priors of two, three, and four percent. Error costs are incorporated into the 
misclassification model by allowing the relative costs of type I and type II errors to vary 
from 1:1 to 30:1. The performance of the predictive model is compared to a naive 
strategy of predicting all potential clients as nonlitigation firms. 

Table 8 contains the results of the prediction analysis. The percentages of type I
and type II errors for the various combinations of prior probability and relative error
cost indicate that as the prior probability of a lawsuit increases, fewer litigation
firms are misclassified. Likewise, as the relative cost of type I to type II errors
increases, the model correctly classifies more litigation firms. Comparing the model’s
performance to a naive model indicates that with relative error costs of at least 20:1 
and prior probabilities exceeding three percent, the predictive model outperforms the 
naive model. If the prior probability of a lawsuit is assumed to be equal to three percent, 
the predictive model provides superior results when relative error costs are assumed to 
be at least 25:1. 
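
The link between relative error costs and the number of litigation firms caught can be illustrated with a small cutoff search. This is a sketch under stated assumptions, not the authors' procedure: the six scores and labels are made-up values, the function names are ours, and a cutoff of infinity stands in for the naive strategy of classifying every firm as nonlitigation.

```python
def error_rates(scores, labels, cutoff):
    """Return (type I rate, type II rate) when scores >= cutoff are flagged as litigation."""
    lit = [s for s, y in zip(scores, labels) if y == 1]
    non = [s for s, y in zip(scores, labels) if y == 0]
    type1 = sum(s < cutoff for s in lit) / len(lit)    # litigation firms missed
    type2 = sum(s >= cutoff for s in non) / len(non)   # sound firms flagged
    return type1, type2


def best_cutoff(scores, labels, prior, cost_ratio):
    """Pick the cutoff that minimizes expected cost (type II cost normalized to 1)."""
    candidates = sorted(set(scores)) + [float("inf")]  # inf = naive strategy
    best_c, best_ec = None, float("inf")
    for c in candidates:
        type1, type2 = error_rates(scores, labels, c)
        ec = type1 * prior * cost_ratio + type2 * (1.0 - prior)
        if ec < best_ec:
            best_c, best_ec = c, ec
    return best_c, best_ec


# Hypothetical litigation scores (1 = litigation firm, 0 = control firm).
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

low_ratio = best_cutoff(scores, labels, prior=0.04, cost_ratio=1)
high_ratio = best_cutoff(scores, labels, prior=0.04, cost_ratio=30)
print(low_ratio, high_ratio)
```

In this toy data the cost-minimizing cutoff falls as the cost ratio rises, so more litigation firms are flagged at the price of more type II errors, which is the pattern described for Table 8.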

While it may be unreasonable for audit firms to use this model as the sole determinant
in accepting or rejecting a potential audit engagement,23 the results indicate that
the model is effective in classifying high-risk audit engagements. With this information, 
auditors could adjust audit fees and audit hours to reflect the risk of litigation presented 
by a particular client. 


VI. Conclusion 


This research identifies several client and auditor characteristics as being associ- 
ated with lawsuits against auditors. Industry membership was found to influence the 
significance of characteristics associated with lawsuits against auditors. This factor 


21 As an example of how this model is used, assume the probability of a type I error is 0.3469 and the
probability of a type II error is 0.4082 when the relative cost of type I to type II errors is 20:1 and the prior
probability of a lawsuit is four percent. With a sample of 49 litigation and 49 control firms, the expected cost of
misclassification is 0.3469 × 0.04 × 20 + 0.4082 × 0.96 × 1 = 0.6694. Using a naive model (predicting all firms as no
lawsuit) would result in a misclassification cost of 49/49 × 0.04 × 20 + 0/49 × 0.96 × 1 = 0.80. In this example,
the prediction model minimizes the expected costs associated with misclassification with an expected relative
cost of 0.837 (0.6694/0.80).

22 The litigation rate for each auditor classification is determined by dividing the number of suits filed
during the period 1960-1985 by the estimated number of audit engagements performed during that same
period (see Palmrose 1988, 64, table 3).

23 At its maximum effectiveness, the model would suggest that the audit firm refuse 53.06 percent of those
firms that actually present little risk of litigation (see table 8, relative error costs = 30:1, prior probability = 4
percent).

Table 8
Probability of Errors and Model Performance Using Various Prior
Probabilities and Various Relative Error Costs

Relative Costs of                                            Cost of Model Errors
Type I and Type II   Prior Probability   Type I*   Type II†  Relative to Cost of Errors
Errors               of Litigation       Errors    Errors    from Naive Strategy‡
 1:1                 0.02                0.8776    0.0612    3.880
 1:1                 0.03                0.8776    0.0612    2.857
 1:1                 0.04                0.8776    0.0612    2.348
 5:1                 0.02                0.8776    0.0612    1.478
 5:1                 0.03                0.8776    0.0612    1.273
 5:1                 0.04                0.8776    0.0612    1.172
10:1                 0.02                0.8776    0.0612    1.178
10:1                 0.03                0.8571    0.0612    1.055
10:1                 0.04                0.8367    0.1224    1.131
15:1                 0.02                0.8571    0.0612    1.057
15:1                 0.03                0.8367    0.1633    1.189
15:1                 0.04                0.4694    0.3469    1.025
20:1                 0.02                0.8367    0.1224    1.137
20:1                 0.03                0.4898    0.3469    1.061
20:1                 0.04                0.3469    0.4082    0.837
25:1                 0.02                0.7143    0.2041    1.114
25:1                 0.03                0.3878    0.4082    0.916
25:1                 0.04                0.2245    0.5306    0.734
30:1                 0.02                0.4898    0.3265    1.023
30:1                 0.03                0.3265    0.4286    0.788
30:1                 0.04                0.2041    0.5306    0.629

* Type I error is defined as classifying an observation as nonlitigation when it is a litigation firm.
† Type II error is defined as classifying an observation as litigation when it is a nonlitigation firm.
‡ Naive strategy is defined as classifying all firms as nonlitigation firms.


was controlled by selecting a sample matched on time period and industry member- 
ship. The results indicate that the asset structure of the client, its financial condition, its 
market value, and the variability of its returns each influence the likelihood of litigation 
against the auditor. 

The predictive ability of the model was tested and found to outperform a naive 
model when realistic prior probabilities and relative error costs are assumed. This in- 
formation could aid auditors in assessing appropriate risk levels presented by clients. 
Auditors could then institute appropriate audit procedures to compensate for increased 
litigation risk and/or price their services to reflect the assessed level of litigation risk.

SYNOPSIS: Tax law complexity and ambiguity may result in uncertainty
about taxable income (Slemrod 1988) and are of concern to policy-making
bodies such as the ABA, the AICPA, and the IRS (Sheppard and Evans
1990). Several studies have modeled the effects of uncertainty on taxpayer
reporting and the role of tax practitioners in reducing uncertainty (Alm
1988; Shavell 1988; Beck and Jung 1989a, 1989b; Scotchmer 1989a, 1989b;
and Scotchmer and Slemrod 1989). Empirical and experimental research,
however, has not kept pace.1

This paper reports experimental tests of the effects of income 
uncertainty and other economic factors based on tax reporting models in 
Beck and Jung (1989a). Hypotheses were tested regarding the effects of 


1 Besides the research reported herein, two recent studies have examined the impact of taxpayer
uncertainty on tax reporting in an economic context. Klepper et al. (1988) attempt to test whether taxpayers having
“questionable income sources” and who use tax advisors report a smaller percentage of “actual” income
(determined by TCMP audit) than those taxpayers who do not use advisors. Their analysis, however, relies on the
tenuous use of the number of Revenue Rulings issued for a particular income source as a proxy for taxpayers’
uncertainty level. Alm et al. (1990) use laboratory experiments to examine the impact of taxpayer uncertainty
regarding tax rates, penalty rates, and the redistribution of taxes on tax evasion in a public goods setting. Our
study can be distinguished in that it examines tax base uncertainty and focuses on taxpayer aggressiveness
rather than on tax evasion. In addition to these economic studies, behavioral researchers have examined the
impact of uncertainty on taxpayer reporting in a variety of studies (Milliron 1985; Schadewald 1989).

changes in the uncertainty level, tax rate, penalty rate, and audit probability
on reported taxable income. In addition, the explanatory power of the 
models was evaluated by comparing the taxable income reported by the 
subjects with model-based predictions. 

Subjects were endowed with a fictitious currency and were given a
range of possible post-audit taxable incomes from which to report. A 
proportional tax was paid on reported income and, in the event of an audit, 
a monetary penalty was imposed when the actual taxable income was 
greater than the amount reported. Incentives were provided by making the 
subjects’ post-experimental remuneration a function of the after-tax income 
retained from each experimental trial. 

Since previous theoretical research indicates that taxpayers’ reporting
decisions are sensitive to risk preferences, three separate experiments were
performed and the results were analyzed by repeated measures ANOVAs.
In the first and second experiments, subjects’ risk-taking attitudes were 
controlled by the Berg et al. (1986) mechanism. Risk-neutrality (risk- 
aversion) was induced in the first (second) experiment, while subjects’ 
preferences in the third experiment were measured ex post, rather than 
controlled. Two measures of taxpayer reports were employed—the actual 
income reported by subjects and the corresponding reporting fractile. 

The experimental results provided support for risk-neutral predictions. 
First, risk-neutral subjects were found to report higher levels of income 
when penalty rates and audit probabilities increased. Second, the tax rate 
did not affect the reporting behavior of risk-neutral subjects. Third, income 
reports were affected by two interactions: audit probability with uncertainty 
and penalty rate with uncertainty. Specifically, a reduction in uncertainty 
led to higher (lower) levels of reported taxable income when penalty rates 
or audit probabilities were decreased (increased). In addition, the mean 
reporting fractile did not change with the uncertainty level and the 
deviation of mean observed reports from predicted levels was small. 

For the risk-averse model, the predicted tax rate effect was marginally 
significant for reports and insignificant for fractiles. Furthermore, only a 
small percentage of the variance was explained by the interaction of tax 
rate and uncertainty. Report fractiles, however, did increase significantly as 
predicted when uncertainty was elevated. 


Key Words: Tax complexity, Experimental economics, Taxpayer aggres- 
siveness, Tax compliance. 


Data Availability: The experimental data are available from the authors on 
a high density diskette upon written request. 


The remainder of this paper is organized as follows. Section I introduces the
model and describes the experimental setting and the associated theoretical
assumptions. Next, comparative statics of the model are analyzed as the basis for
developing experimental hypotheses for subjects’ tax reporting behavior. Section II
explains the experimental procedures employed, while the design and the results are
presented in section III. The conclusions are discussed in section IV.


I. Theory and Operationalization 


Models of tax reporting under income uncertainty (Alm 1988; Scotchmer 1989a, 
1989b; Scotchmer and Slemrod 1989) have found that risk-averse taxpayers will have in- 
centives to report higher (lower) levels of income when uncertainty is increased (de- 
creased). Beck and Jung (1989a) modeled both risk-averse and risk-neutral taxpayers in 
their analysis of taxable income uncertainty. While the reporting behavior of risk- 
averse taxpayers was found to be consistent with the above studies, this was not neces- 
sarily the case for risk-neutral taxpayers. Accordingly, as discussed below, the influ- 
ence of risk preferences is recognized explicitly in the tax reporting experiments. 


The Experimental Setting 


Each session in the experiments had seven subjects who were given endowments 
(representing pre-tax income), y, of 1,000 units of a fictitious experimental currency 
(called “Francs”) at the beginning of each of 60 trials. As a simplification, subjects’ task
was to specify a taxable income to report, R. The actual taxable income (x) was not
known by subjects and their uncertainty was represented by a uniform probability
distribution, f(x) = 1/(H − L), where [L, H] is the range of possible taxable income values.2
After making each report, subjects paid a proportional tax (characterized as a 
“surcharge” in the experimental instructions) on their reported income and faced the 
possibility of being audited. 

In the event of an audit, a subject’s actual payment was based upon the audit 
outcome determined by a drawing from f(x), rather than the amount reported. Also, a 
penalty was imposed if the audit outcome exceeded the reported amount. Consistent 
with the United States tax environment, a proportional penalty rate, q, was applied to 
the tax deficiency: t(x—R), where t is the tax rate. As discussed below, both q and t 
were manipulated experimentally. Realistic economic incentives were provided by 
making subjects’ compensation dependent upon their (after-tax and penalty) disposable 
income. 

Two sources of uncertainty could affect experimental earnings and the subjects’ 
cash payoffs in our operationalization of the tax setting. The first was whether an audit 
occurred; the second was the specific taxable income drawn in the event of an audit. 
Since taxpayer reporting is the phenomenon of interest in the study, the environment 
was simplified by making the tax agency’s audit decisions exogenous. Accordingly, 
subjects in the experiments were told that audits occurred randomly3 and with a known


2 Subjects’ reports were constrained to the interval [L, H] since tax evasion was not a research issue. As a
practical matter, however, subjects would have no incentive to select a report, R > H, irrespective of the penalty
rate or audit probability. A lower bound for reports could be rationalized by the presence of large (criminal)
penalties for underpayment. While a uniform probability distribution is assumed as an operational
simplification in the experimental setting, such a distribution is not essential to the analysis (Beck and Jung 1989a). The
hypotheses derived require only that the cumulative distributions cross at a single point, x_0 (i.e., G(x) > F(x) for
x < x_0, with the inequality reversed for x > x_0).

3 An alternative would have been to make the tax agency’s audit decisions strategically based. However,
random (non-strategic) auditing is consistent with the perceptions of a significant class of taxpayers. Empirical
evidence suggests that taxpayers have differing beliefs about the tax agency’s approach to selecting returns for

Table 1
Disposable Income of a Taxpayer Under Three Possible Events

Event                        Net Payoff               Probability
1. No Audit                  y − tR                   1 − p
2. Audit/No Underpayment     y − tx                   $p \int_L^R f(x)\,dx$
3. Audit/Underpayment        y − tx − qt(x − R)       $p \int_R^H f(x)\,dx$





probability, p. The audit selection rule was operationalized by constructing a binary 
(audit/no audit) partition of the outcome space (numbers rolled on a ten-sided die) so 
that the relative frequency of occurrence corresponded with the value of p. 

Income uncertainty was made salient by employing a bingo cage containing se- 
quentially numbered balls. Since subjects knew that each ball corresponded to a par- 
ticular taxable income level, changes in uncertainty were achieved experimentally by 
varying the number of balls in the bingo cage. One bingo cage contained 11 balls to rep- 
resent a low level of uncertainty (taxable income from 700 to 800 Francs in 10-Franc 
intervals), while a second contained 51 balls to represent a high level of uncertainty 
(taxable income from 500 to 1,000 in 10-Franc intervals). This operationalization per- 
mitted the range of possible taxable income values to be manipulated, while holding the 
expected taxable income constant.4 Table 1 illustrates the disposable income levels (net
payoffs) corresponding to the three events described above. 
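
The mechanics of a single trial can be sketched as follows. This is an illustration under stated assumptions rather than the experimental software: the function and variable names are ours, the rule that die faces 1 through `audit_faces` trigger an audit is one hypothetical way to partition the ten-sided die, and the tax and penalty rates in the usage lines are placeholders, not the experimental parameters.

```python
import random

# Bingo-cage contents described in the text: one ball per taxable income level.
LOW_UNCERTAINTY = list(range(700, 801, 10))    # 11 balls, 700 to 800 Francs
HIGH_UNCERTAINTY = list(range(500, 1001, 10))  # 51 balls, 500 to 1,000 Francs


def net_payoff(y, t, q, R, audited, x=None):
    """Disposable income for the three events in Table 1."""
    if not audited:
        return y - t * R                      # event 1: no audit
    if x <= R:
        return y - t * x                      # event 2: audit, no underpayment
    return y - t * x - q * t * (x - R)        # event 3: audit, underpayment


def run_trial(R, balls, audit_faces, t, q, y=1000, rng=random):
    """Simulate one trial: roll the ten-sided die, then draw a ball if audited."""
    audited = rng.randint(1, 10) <= audit_faces   # hypothetical audit partition
    x = rng.choice(balls) if audited else None
    return net_payoff(y, t, q, R, audited, x)


# Placeholder parameters: 30 percent tax rate, penalty rate of 2, p = 0.4.
payoff = run_trial(R=750, balls=HIGH_UNCERTAINTY, audit_faces=4, t=0.3, q=2.0,
                   rng=random.Random(0))
print(payoff)
```

Both cages have the same mean taxable income (750 Francs), so switching cages changes the range of outcomes without changing the expected taxable income.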

For these disposable income levels, the subject’s expected utility is given by:

$$EU = (1-p)\,U(y - tR) + p\left[\int_L^R U(y - tx)\,f(x)\,dx + \int_R^H U\bigl(y - tx - qt(x - R)\bigr)\,f(x)\,dx\right], \tag{1}$$

where U(·) denotes the utility function defined with respect to disposable income and
all terms are as defined before.





audit. One IRS survey (Aitken and Bonneville 1980) reports that 29.3 percent of respondents indicate that they 
believe the IRS randomly selects returns for audit. An additional 30.4 percent of respondents believe that attri- 
butes of the tax return provide the basis for audit selection. Remaining respondents either stated that they did 
not know how returns are selected (17.4 percent) or indicated that some other audit selection rule was used 
(21.7 percent). These data suggest that almost half of the respondents (46.7 percent) would act (in accordance
with our model) as if the IRS audits randomly (combining those who stated this belief with those who are naive 
with respect to the audit decision). In any event, the evidence gathered from these experiments can provide a 
benchmark for comparisons with future studies in which the tax agency operates strategically. 

4 The only means of changing the level of uncertainty given the uniform distribution assumption without
concurrently altering the mean (consistent with the theoretical approach in Rothschild and Stiglitz 1970) is to
manipulate the supports of the income distribution.

Several hypotheses were developed from a comparative statics analysis performed
on equation (1). Among the effects considered were changes in audit probability, tax
rate, penalty rate, and taxable income uncertainty. In addition, predictions were
obtained regarding the amount of taxable income reported. The utility functions U(w) = w
and U(w) = −e^{−λw} were employed to represent risk-neutral and risk-averse preferences,
respectively, where w denotes disposable income. The λ parameter is a positive
constant denoting the degree of absolute risk-aversion (Arrow 1971; Pratt 1964).


Hypotheses 


Assuming risk-neutrality, Beck and Jung (1989a) showed that the optimality condi- 
tion corresponding to equation (1) can be simplified to obtain the following implicit 
solution: 


$$F(R_N) = 1 - \frac{1-p}{pq}, \tag{2}$$

where R_N denotes the amount of the taxable income that would be reported by
risk-neutral subjects.

Since F(·) is a cumulative probability distribution for taxable income, equation (2)
identifies an optimal reporting fractile corresponding to R_N. The audit probability and
the penalty rate determine the specific fractile. Since the uniform taxable income
distribution can be inverted, an explicit solution can be obtained for R_N:

$$R_N = L + (H - L)\left[1 - \frac{1-p}{pq}\right]. \tag{3}$$

From equations (2) and (3), the predicted reporting fractile, F(R_N), and the
predicted report, R_N, are both increasing functions of the audit probability (p) and the
monetary penalty rate (q), thus suggesting the following hypothesis.5


H1: Ceteris paribus, under risk-neutrality, the amount of reported income and re- 
porting fractile will increase as the audit probability (p) and penalty (q) in- 
crease. 


The tax rate, t, does not appear in either of the optimality conditions. Since tax rate 
changes have no effect on the optimal reporting decision, a second hypothesis is sug- 
gested. 


H2: Given risk-neutrality, the amount of reported income and reporting fractile 
will be unaffected by the tax rate (t). 
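
The risk-neutral predictions behind H1 and H2 can be checked numerically from the solution F(R_N) = 1 − (1 − p)/(pq) and its inversion under the uniform distribution. The sketch below is illustrative only: the function names and the (p, q) values are assumptions of ours, and clipping the fractile to [0, 1] is our way of handling corner cases that the text does not discuss.

```python
def reporting_fractile(p, q):
    """Risk-neutral optimal fractile, 1 - (1 - p)/(p q), clipped to [0, 1]."""
    return min(1.0, max(0.0, 1.0 - (1.0 - p) / (p * q)))


def report_risk_neutral(L, H, p, q):
    """Risk-neutral optimal report under a uniform distribution on [L, H].

    The tax rate t is deliberately not an argument: it drops out of the
    risk-neutral optimality condition, which is the content of H2.
    """
    return L + (H - L) * reporting_fractile(p, q)


# Hypothetical parameter values chosen only for illustration.
for p, q in [(0.4, 2.0), (0.5, 2.0), (0.6, 2.0), (0.4, 3.0)]:
    print(p, q,
          round(reporting_fractile(p, q), 4),
          round(report_risk_neutral(500, 1000, p, q), 2))
```

Raising either p or q raises both the fractile and the report, as H1 predicts; with (1 − p)/(pq) = 0.5 the optimal report sits at the mean of the income distribution.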


The effects of changes in subjects’ taxable income uncertainty can be examined by
replacing the lower and upper income supports by L − Δ and H + Δ, where Δ > 0. The
perturbed taxable income distribution is denoted by:

$$F^*(x) = \frac{x - (L - \Delta)}{(H + \Delta) - (L - \Delta)}, \tag{4}$$


5 The analysis in appendix A indicates that hypothesis H1 also applies to risk-averse subjects. However,
hypothesis H1 was not tested in the risk-averse setting due to limited financial resources. Accordingly, the
focus of our risk-averse experiment was on the comparative static properties that differ from those under
risk-neutrality.

where F*(x) has the same mean as F(x), but a wider range (higher variance).
Substituting equation (4) into equation (3), the optimal amount of taxable income to
be reported under distribution F*(x) would be:

$$R_N^* = (L - \Delta) + (H + 2\Delta - L)\left[1 - \frac{1-p}{pq}\right]. \tag{5}$$


By differentiating equation (5) partially with respect to Δ, one can verify that:

$$\frac{\partial R_N^*}{\partial \Delta} = 1 - \frac{2(1-p)}{pq}. \tag{6}$$

For the special case in which (1 − p)/pq = 0.5, note that ∂R*_N/∂Δ = 0, indicating that
changes in the level of uncertainty have no effect on subjects’ reports. Substituting back
into the optimality condition in equation (2), one can verify that this result will arise
only when F*(R*_N) = 0.5 (i.e., when it is optimal for subjects to report at the mean of the
income distribution). Otherwise, the sign of ∂R*_N/∂Δ in equation (6) will be either
positive or negative, depending upon p and q, thus implying interaction effects from
changes in taxable income uncertainty and the penalty rate and audit probability.
When reporting below the mean of the income distribution is initially optimal, it
follows that ∂R*_N/∂Δ < 0. This implies that an increase (a reduction) in the level of
uncertainty will result in a lower (higher) reported income, even though the expected value of
the subjects’ tax liability remains the same. The reverse is true when the optimal
reporting fractile is above the mean.

While the taxable income report is predicted to change with the uncertainty level,
such is not the case for the reporting fractile. Substituting equation (5) for R*_N into
equation (4), one can verify that F*(R*_N) = F(R_N) as in equation (2), suggesting that the
parameters do not depend on Δ and the optimal reporting fractile is invariant with
respect to changes in the level of taxable income uncertainty. These results suggest the
following hypotheses:


H3a: An increase in the taxable income range under risk-neutrality will result in a 
higher (lower) reported income when (1—p)/(pq)<(>)0.5. 

H3b: An increase in the taxable income range under risk-neutrality will not alter 
the reporting fractile. 
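
H3a and H3b can be illustrated by widening the low-uncertainty range [700, 800] by Δ = 200 on each side, which gives the high-uncertainty range [500, 1,000] used in the experiment. The sketch below is a hedged illustration: the function names and the (p, q) pairs are assumptions of ours, and the formulas are the risk-neutral report under the perturbed supports L − Δ and H + Δ together with the corresponding uniform fractile.

```python
def perturbed_report(L, H, delta, p, q):
    """Risk-neutral optimal report when the supports are L - delta and H + delta."""
    k = (1.0 - p) / (p * q)
    return (L - delta) + (H - L + 2.0 * delta) * (1.0 - k)


def perturbed_fractile(R, L, H, delta):
    """Uniform cumulative probability of report R on [L - delta, H + delta]."""
    return (R - (L - delta)) / ((H + delta) - (L - delta))


L, H, DELTA = 700, 800, 200   # low range widened to the high range 500 to 1,000

for p, q in [(0.4, 2.0), (0.5, 2.0), (0.6, 2.0)]:   # hypothetical values
    low = perturbed_report(L, H, 0, p, q)
    high = perturbed_report(L, H, DELTA, p, q)
    print(p, q, round(low, 2), round(high, 2),
          round(perturbed_fractile(low, L, H, 0), 4),
          round(perturbed_fractile(high, L, H, DELTA), 4))
```

When (1 − p)/(pq) exceeds 0.5 the report falls as the range widens, when it is below 0.5 the report rises, and in every case the fractile is unchanged.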


While hypothesis H3a predicts an interaction among uncertainty, the audit proba- 
bility, and penalty rate on reported income, hypothesis H3b predicts no effect on the re- 
port fractile. Having developed hypotheses for risk-neutral taxpayers, we now consider 
the case of risk-aversion.

Theoretical propositions regarding risk-aversion effects were obtained by employ- 
ing a negative exponential utility function. This choice was based on two considera- 
tions. In addition to obtaining a closed-form solution for the optimal income reporting 
level, the negative exponential utility function exhibits constant absolute risk-aversion 
which avoids a potential source of confounding that would be present if risk-aversion 
were to covary with other manipulated variables due to income effects. 

Substituting the negative exponential utility function into equation (1), one can
show that the optimal reporting level is (see appendix A):

$$R_A = H - \frac{1}{(1+q)t\lambda}\,\ln\left[1 + \frac{(1+q)t\lambda(H-L)(1-p)}{pq}\right]. \tag{7}$$

Comparative statics results are derived in appendix A for the effects of changes in
various model-based parameters. As in the risk-neutral model, increases in either the
audit probability or the penalty rate are expected to result in a higher reported income
and reporting fractile. There are, however, important qualitative differences between
R_A and R_N. First, in contrast with the optimal response under risk-neutrality, the tax
rate (t) now appears in expression (7) and R_A is an increasing function of the tax rate
(see appendix A). Such a finding is consistent with previous studies of risk-averse
taxpayers (Yitzhaki 1974; Beck and Jung 1989a) and suggests the following hypothesis:


H4: A tax rate increase under risk-aversion will result in a higher reported taxable
income and a higher reporting fractile.


As in the risk-neutral case, the effect of taxable income uncertainty is analyzed by
perturbing the range [L, H] of taxable income values while holding the mean constant.
Once again, the effects upon both reported income and the reporting fractile are
considered. Based on the comparative statics analysis performed in appendix A, the
following hypotheses are stated:


H5a: An increase (by an amount Δ) in the range of taxable income under
risk-aversion will result in a higher (lower) amount of taxable income being reported
when Δ is greater (less) than {[2 − p(2 + q)]/[(1 − p)(1 + q)λt] − (H − L)}/2.

H5b: The reporting fractile chosen under risk-aversion will increase with the range 
of taxable income. 
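
The risk-averse predictions can be explored numerically. The sketch below rests on several assumptions that should be read as ours: the closed form for the report is the one obtained by substituting U(w) = −e^{−λw} and the uniform density into the expected utility expression, the parameter values (p, q, t, λ) are hypothetical, and the midpoint-rule integration is only a numerical check that the closed form maximizes expected utility.

```python
import math


def report_risk_averse(L, H, p, q, t, lam):
    """Closed-form optimal report for U(w) = -exp(-lam * w), uniform income on [L, H]."""
    a = (1.0 + q) * t * lam
    return H - math.log1p(a * (H - L) * (1.0 - p) / (p * q)) / a


def expected_utility(R, L, H, p, q, t, lam, y=1000.0, steps=2000):
    """Expected utility of reporting R, integrating audit outcomes by the midpoint rule."""
    def u(w):
        return -math.exp(-lam * w)

    dx = (H - L) / steps
    audited = 0.0
    for i in range(steps):
        x = L + (i + 0.5) * dx
        penalty = q * t * (x - R) if x > R else 0.0
        audited += u(y - t * x - penalty) * dx / (H - L)
    return (1.0 - p) * u(y - t * R) + p * audited


def fractile(R, L, H):
    """Uniform cumulative probability of report R on [L, H]."""
    return (R - L) / (H - L)


# Hypothetical parameters, chosen only for illustration.
P, Q, T, LAM = 0.4, 2.0, 0.3, 0.005
r_low = report_risk_averse(700, 800, P, Q, T, LAM)     # low uncertainty
r_high = report_risk_averse(500, 1000, P, Q, T, LAM)   # high uncertainty
print(round(r_low, 2), round(r_high, 2),
      round(fractile(r_low, 700, 800), 3), round(fractile(r_high, 500, 1000), 3))
```

For these placeholder values the report rises with the tax rate (H4), the reporting fractile rises when the range widens (H5b), and as λ approaches zero the report approaches the risk-neutral level.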


The effects of changes in uncertainty clearly depend on subjects’ risk preferences
and whether the focus is on reported income or the corresponding fractile. Note that,
while risk-averse subjects are expected to increase their reporting fractile in response
to an increase in Δ (H5b), no effect on the fractile is expected under risk-neutrality
(H3b). The effects on reported income also are expected to be somewhat different in
that Δ interacts jointly with the penalty rate, audit probability, and tax rate under
risk-aversion (H5a), but only with p and q under risk-neutrality (H3a).


II. Experimental Administration 


The subjects were 112 undergraduate and graduate students from the University of 
Illinois at Urbana-Champaign. Upon arriving, subjects were given written instructions 
(available from the authors on written request). The first part described a probability 
training exercise similar to that used by Plott and Sunder (1982). The objective of this 
exercise was to provide subjects with a knowledge of the outcome-generating prop- 
erties of the bingo cages and ten-sided dice used in the experiment. Subjects had an 
opportunity to observe the operation of the bingo cages and the dice for 40 trials. Prior 
to each draw from the bingo cage (or roll of the dice), subjects were asked to predict the
outcome (either X or Y, defined on partitions of the outcome space). Subjects were re- 
warded (penalized) for making correct (incorrect) predictions. 

Upon completion of the probability training session, subjects were told to read the
remainder of the instructions. Role playing by subjects was discouraged by not
mentioning taxes, audits, etc., in the instructions or during the administration of the
experiment (see Davis and Swenson 1988, 20).6 Responses to a query in the
post-experimental questionnaire suggest that subjects had not discussed the experiment with
others prior to their participation.7

After completing the instructions, subjects were required to pass a quiz that tested 
their comprehension of the tax and penalty computations and the cash lottery proce- 
dure employed for utility induction. Following the quiz, subjects were informed of the 
initial parameter values to be used. 

Subjects began each experimental trial by choosing and recording a reported in- 
come level within an interval specified by the experimenter, either 700 to 800 Francs 
(low uncertainty) or 500 to 1,000 Francs (high uncertainty). After recording their report,
subjects proceeded to an “investigation table” where a ten-sided die was rolled to deter- 
mine whether an audit was to take place. When subjects were audited, the experi- 
menter drew from the bingo cage a ball that determined the post-audit taxable income. 
This information was recorded next to subjects’ reports. Next, subjects computed their 
net earnings (in Francs) and, in experiments one and two, win-range points (see 
appendix B). They then proceeded to a “lottery table” where their computations were 
reviewed, the prize number determined, and, possibly, a cash prize awarded.8 After
completing all experimental trials, total cash payments to subjects were tallied and
paid9 while subjects completed a questionnaire designed to gather demographic
information and evidence regarding the validity of the experiment.10


III. Experimental Design and Results


Since subjects’ risk preferences are predicted to influence reporting behavior, it 
was necessary to recognize them in our experiments. In experiments one and two, we 
attempted to control preferences by means of the utility induction procedure described 
in Berg et al. (1986).11 The first experiment examined hypotheses H1 to H3 (risk-neutral


6 Approximately 14 percent of subjects raised the issue of tax compliance when responding to the
open-ended question: “In a few words, describe the issue that you think this experiment was trying to address.”
While a possibility for bias was introduced in the experiment, the effect is not expected to be serious given the 
small proportion of subjects professing this belief. 

7 In a two-part question on the post-experimental questionnaire, 11 percent of subjects responded “yes” to
the question: “Did you have any advance knowledge or discussions with anyone regarding this experiment?” 
However, when asked about the nature of this advance knowledge, all subjects indicated that they had heard 
about either the cash rewards available or the general nature of the task to be performed. None of the informa- 
tion obtained by subjects ex ante was deemed to be insightful enough to contaminate the experiment. 

8 For the experiment in which risk preferences were measured rather than induced, the cash earned by
subjects was a fixed proportion of their ending Francs for each trial. 

° Cash payments to subjects ranged from $29.25 to $47.73 with an average payment of $39.52 per subject 
for participation in a two-and-one-half hour experiment. 

10 The post-experimental questionnaire was divided topically into five sections. The first section contained
questions concerning subjects’ beliefs regarding the presence of interference by the experimenter with the 
bingo cages, dice, or the decisions of subjects. The remaining four sections dealt with the four precepts neces- 
sary for successfully controlling preferences over an experimental commodity (Wilde 1981; Smith 1982): 
salience, nonsatiation, privacy, and dominance. The results of the questionnaire suggest that subjects did not 
believe the experimenter interfered in any way and that the four precepts were satisfied. 

11 As described in greater detail in appendix B, the central feature of this approach is to map subjects’
after-tax disposable income (denominated in Francs) onto the probability of winning a cash prize in a lottery.
Risk-neutral preferences are induced by ensuring that, for every additional unit of experimental currency earned by
behavior), while the second experiment was designed to test hypotheses H4 and H5
(risk-averse behavior). A third experiment replicated the parameter design of the
second experiment. However, subjects’ risk preferences were measured using the
Kachelmeier (1989) method, rather than induced (see appendix B for details). Within
each of the experiments, 14 subjects were randomly assigned to each experimental
cell.¹³

In testing the comparative statics properties of the models, the effects upon both re- 
ported taxable income and the reporting fractile were considered. Reported taxable 
income provides an absolute measure of the effects of the experimental manipulations, 
while the reporting fractile provides a relative measure of behavior in that subjects’ 
reports are standardized with respect to the probability distribution for taxable 
income. Since the model-based predictions are generally the same for both reported in- 
come and fractiles, the ANOVA test results for reported income are emphasized in our 
discussion. An exception is made, however, when the support of the income
distribution is manipulated, due to the divergence in the theoretical predictions.¹⁴
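
The relation between a report and its fractile can be illustrated with a short sketch. It assumes only what the text states: taxable income is uniformly distributed, so the fractile is a linear function of the report. The function name and the truncation to [0, 1] are illustrative choices, and the two sample reports are cell means taken from table 4 (audit probability 0.4).

```python
def reporting_fractile(report, low, high):
    """Fractile of `report` under a uniform income distribution on [low, high]."""
    if high <= low:
        raise ValueError("high must exceed low")
    frac = (report - low) / (high - low)
    # Reports outside the support are mapped to the nearest endpoint.
    return min(1.0, max(0.0, frac))


# Mean reports from table 4, panel A, audit probability 0.4.
low_uncertainty = reporting_fractile(736.94, 700, 800)    # support [700, 800]
high_uncertainty = reporting_fractile(705.98, 500, 1000)  # support [500, 1000]
print(round(low_uncertainty, 2), round(high_uncertainty, 2))
```

Because the mapping is linear, a lower mean report under high uncertainty can still correspond to a higher fractile, which is why the two dependent variables are analyzed separately when the support is manipulated.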


Experiment One: Induced Risk-Neutrality 


The first experiment employed the Berg et al. (1986) procedure to induce risk-neutral
preferences for testing hypotheses H1 through H3. The tax rate and uncertainty
were manipulated at two levels within subjects—0.25/0.5 for hypothesis H2 and 
High/Low for hypothesis H3—in a fully crossed design between experiments within 
each cell. Each of the four sets of manipulation combinations was implemented for 15 
trials. The audit probability and penalty rate were manipulated respectively at three 
(0.4, 0.5, 0.9) and two (0.2, 2.0) levels between subjects, with the penalty rate manipula- 
tion nested within the audit probability. The parameter values were selected to provide 
a broad range of fractile predictions for purposes of theory testing and, as such, are not 
necessarily representative of the actual conditions faced by many taxpayers. Panel A of 
table 2 summarizes the design, parameters, and point predictions for experiment one. 
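
The point predictions for experiment one can be reproduced with a few lines of code. The sketch below assumes that the risk-neutral optimum is F = 1 - (1 - p)/(pq), truncated to [0, 1]; this formula is not stated in this section but is the limit of expression (7A) in appendix A as risk aversion goes to zero, and it matches the fractiles listed in table 2. Function and variable names are illustrative.

```python
def risk_neutral_fractile(p, q):
    """Assumed risk-neutral optimum: F = 1 - (1 - p) / (p * q), kept within [0, 1]."""
    return min(1.0, max(0.0, 1.0 - (1.0 - p) / (p * q)))


def expected_report(fractile, low, high):
    """Report at a given fractile of a uniform distribution on [low, high]."""
    return low + fractile * (high - low)


HIGH, LOW = (500, 1000), (700, 800)
# cell -> (audit probability, penalty rate), as listed in table 2, panel A
CELLS = {1: (0.5, 0.2), 2: (0.4, 2.0), 3: (0.5, 2.0), 4: (0.9, 2.0)}

for cell, (p, q) in CELLS.items():
    f = risk_neutral_fractile(p, q)
    print(cell, round(f, 2),
          round(expected_report(f, *HIGH)),
          round(expected_report(f, *LOW)))
```

For cell 4 the unrounded fractile gives a high-uncertainty report of about 972, whereas table 2 lists 970, which is what results if the fractile is rounded to 0.94 before the report is computed.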

The hypotheses regarding risk-neutral reporting were tested via repeated measures 
ANOVAs. The relevant results from the ANOVAs, presented in table 3, are supportive 
of both hypotheses H1 and H2.¹⁵ The penalty and audit probability main effects are sig-





the subject, the probability of winning the lottery increases by the same amount. By contrast, risk-aversion is
induced by employing a concave (negative exponential) mapping function so that the probability increases at a
decreasing rate with each additional experimental currency unit.

12 We initially expected that most subjects would exhibit risk-averse preferences. Thus, by replicating the
parameter values in the second experiment, we had hoped to provide direct evidence regarding the effectiveness
of the Berg et al. (1986) procedure to induce risk-aversion.

13 A series of Kruskal-Wallis one-way ANOVAs using demographic data obtained in the post-experimental
questionnaire suggest that random assignment to experimental treatments was successful.

14 Note that, as the cumulative income distribution function is monotone-increasing in its argument, the
reporting fractile in our model will change in the same direction as reported income. Indeed, for the uniform 
distribution case considered in our experiment, the reporting fractile is a linear function of the reported 
income level. When the support of the income distribution is perturbed, however, the relationship is poten- 
tially more complicated because both change simultaneously in the comparative statics analysis. 

15 The trial effect in the ANOVA was insignificant (p=0.428) and was excluded from the table. In addition,
while some interactions that incorporated the trial effect were significant, they explained almost none of the
variance in the experiment. Hence there was no evidence that learning materially affected the subjects’ report
choices.
Table 2
Experimental Design and Parameters


Panel A. Experiment One: Induced Risk-Neutrality

                                  Probability   Penalty
Cell   Tax Rate    Uncertainty    of Audit      Rate      R_N^H   R_N^L   F(R_N)
1      0.25/0.50   High/Low       0.50          0.20      500     700     0.00
2      0.25/0.50   High/Low       0.40          2.00      625     725     0.25
3      0.25/0.50   High/Low       0.50          2.00      750     750     0.50
4      0.25/0.50   High/Low       0.90          2.00      970     794     0.94


where High=[500, 1000], Low=[700, 800], R_N^H and R_N^L are the expected reports for high and low levels of
uncertainty, respectively, and F(R_N) is the reporting fractile, which is invariant to the uncertainty level (H3b).
The tax rate was changed every 15 trials and uncertainty was changed halfway through the experiment, at
trial number 31, so that the design is fully crossed. The order of tax rate and uncertainty manipulations was
reversed between experimental sessions to permit measurement of order effects.


Panel B. Experiments Two and Three: Induced Risk-Aversion and Measured Risk Preferences

                                 Probability   Penalty
Cell   Tax Rate   Uncertainty    of Audit      Rate      R_A   R_N   F(R_A)   F(R_N)
1      0.25       High           0.50          0.20      579   500   0.158    0.000
                  Low                                    700   700   0.000    0.000
2      0.50       High           0.50          0.20      741   500   0.482    0.000
                  Low                                    700   700   0.000    0.000


where λ=0.023, High=[500, 1000], Low=[700, 800], and R_A and R_N (F(R_A) and F(R_N)) are the expected
reports (fractiles) for experiments two (risk-averse) and three (risk-neutral), respectively. The tax rate and
uncertainty level are manipulated between subjects and within subjects, respectively. The uncertainty
manipulation is made after 30 trials, with the order of the manipulation reversed between sessions to permit
measurement of order effects.


nificant (p<0.001) in both ANOVAs, and together these two effects explain approxi- 
mately 52 percent and 73 percent of the variance for reported income and fractiles, re- 
spectively. Furthermore, mean income reports and mean report fractiles in table 4 
increase with the audit probability and with the penalty rate. Likewise, consistent with 
the theoretical prediction made by hypothesis H2, the tax rate effect is not significant in 
either reported income or report fractile ANOVAs (p=0.655 and 0.737, respectively), 
and the percent of explained variance provided by the tax rate factor is negligible.¹⁶
The results regarding the effects of taxable income uncertainty in experiment one
(table 3) are also supportive of hypotheses H3a and H3b regarding the effect of
uncertainty. Specifically, uncertainty had no impact on the mean fractiles (p=0.879) and did
not explain any of the variation, as predicted by hypothesis H3b (panel B of table 3).
Furthermore, with reports as the dependent variable (panel A of table 3), significant
interactions between uncertainty and the audit probability and penalty rate were observed.
Both interactions were significant (p<0.001), and they accounted respectively for 19 
percent and three percent of the explained variation. Moreover, the direction of the 


16 Note that the ANOVA indicates that the null hypothesis of no effect for tax rate cannot be rejected. While
we cannot conclude that the analysis suggests that the null should be accepted, the extremely small percentage 
of variation accounted for by the tax rate effect suggests that it had little or no impact. 
Table 3
Results of Repeated Measures ANOVA for Experiment One
(Induced Risk-Neutrality)

Panel A. Reported Income ANOVA:

Source of Variation            df   SS           MS           F        Prob.   ω²
Between Subjects
  Audit Probability            2    20,527,019   10,263,509   110.62   0.001   0.437
  Penalty Rate Nested
    within Probability         1    3,187,512    3,187,512    34.32    0.001   0.088
Within Subjects
  Tax Rate                     1    2,438.65     2,438.65     0.20     0.655   0.002
  Audit Probability by
    Uncertainty                2    8,892,577    4,446,288    75.79    0.001   0.189
  Penalty Rate by
    Uncertainty                1    1,315,217    1,315,217    22.42    0.001   0.028

Panel B. Report Fractile ANOVA:

Source of Variation            df   SS           MS           F        Prob.   ω²
Between Subjects
  Audit Probability            2    233.24       116.62       135.21   0.001   0.635
  Penalty Rate Nested
    within Probability         1    37.51        37.51        43.49    0.001   0.100
Within Subjects
  Tax Rate                     1    0.01         0.01         0.11     0.737   0
  Uncertainty                  1    0.00         0.00         0.02     0.879   0
  Audit Probability by
    Uncertainty                2    0.55         0.28         1.55     0.223   0.001
  Penalty Rate by
    Uncertainty                1    0.07         0.07         0.38     0.540   0


Note: Some higher order interactions not reported here were statistically significant. However, none of their
effects was theoretically predicted, and the percentage of variance explained by these effects (ω² statistics)
was less than one percent for any individual factor and, in sum, less than three percent for the
ANOVAs. In contrast, the significant effects reported in this table explain almost 75 percent of the
variance for each panel. The complete ANOVA table is available from the authors upon request.


interactions is consistent with hypothesis H3a. Figure 1 illustrates graphically the mean 
reports under the two uncertainty levels for each level of the audit probability; with a 
0.9 audit probability, the mean report is significantly greater (p<0.001) under high than 
under low uncertainty. For the 0.4 audit probability, however, the difference between 
the two mean reports is not significant. 

Figure 2 illustrates the mean reports under the two levels of uncertainty for each 
penalty rate level. The mean report under high uncertainty and a 0.2 penalty rate is sig- 
nificantly less (p<0.001) than under low uncertainty. This difference is not significant 
when the penalty rate is 2.0. 


Experiment Two: Induced Risk-Aversion 


The second experiment relied on induced risk-averse preferences consistent with a
negative exponential utility function with λ=0.023 to test hypotheses H4 and H5, using
Table 4
Cell Means for All Experiments


Panel A. Experiment One: Induced Risk-Neutrality

                    Probability                           Penalty (Nested in Prob.=0.5)
Uncertainty         0.4      0.5*     0.9      Total      0.2      2.0      Total
L (Report)          736.94   745.21   793.66   758.60     714.01   745.21   729.64
L (Fractile)        0.37     0.45     0.94     0.59       0.14     0.45     0.30
H (Report)          705.98   713.70   956.58   792.01     570.63   713.70   642.17
H (Fractile)        0.41     0.43     0.91     0.58       0.14     0.43     0.28
Total (Rep)         721.46   729.46   875.12              642.35   729.46
Total (Fractile)    0.39     0.44     0.93                0.14     0.44

Panel B. Experiments Two and Three: Induced Risk-Aversion and Measured Risk Preferences

                    Experiment Two                Experiment Three
                    Tax                           Tax
Uncertainty         0.25     0.5      Total       0.25     0.5      Total
L (Report)          728.38   728.52   728.45      713.01   728.49   719.75
L (Fractile)        0.28     0.29     0.29        0.13     0.27     0.20
H (Report)          677.78   745.70   711.74      624.64   623.91   624.27
H (Fractile)        0.36     0.49     0.42        0.25     0.25     0.25
Total (Rep)         703.08   737.11               668.83   675.20
Total (Fractile)    0.32     0.39                 0.19     0.26


* Means presented at this level of the factor are for penalty rate=2.0 to maintain comparability across
levels of audit probability.


the experimental design and parameters in table 2, panel B. As indicated in the table, 
uncertainty was manipulated within subjects at two levels (high and low), each for 30 
trials. In addition, the tax rate was manipulated between subjects in a fully crossed 
design (0.25, 0.5), while the penalty rate and audit probability were fixed (at 0.2 and 0.5, 
respectively) throughout the experiment (table 2).¹⁷ Tests of risk-averse hypotheses were
performed using repeated measures ANOVAs. 

The tax rate effect tested by hypothesis H4 is marginally significant (p<0.077) in 
the reported income ANOVA and explains 3.5 percent of the variation (panel A of table 
5). Within the ANOVA for fractiles, however, the tax rate effect is not significant 
(p=0.182) and explains only one percent of the variance (panel B of table 5). Neverthe- 
less, table 4 indicates that the mean reports and mean fractiles for the tax rate effect are 
as hypothesized—a lower mean report and lower fractile in the low-tax rate condition 
and a higher mean report and fractile in the high-tax rate condition. Likewise, the form 
of the interaction effect is as hypothesized by hypothesis H5a (figure 3): less (more)


17 The penalty rate and audit probability were not manipulated in this experiment since their impact on
reporting was examined in the first experiment and theoretical predictions are not affected by risk preferences. 
Figure 1
The Audit Probability by Uncertainty Interaction Effect in Experiment 1

Figure 2
The Penalty Rate by Uncertainty Interaction Effect in Experiment 1
Table 5
Results of Repeated Measures ANOVA for Experiment Two
(Induced Risk-Aversion)

Panel A. Reported Income ANOVA:

Source of Variation      df   SS           MS           F      Prob.   ω²
Between Subjects
  Tax Rate               1    486,812.86   486,812.86   3.42   0.077   0.035
Within Subjects
  Tax by Uncertainty     1    481,787.20   481,787.20   3.62   0.089   0.035

Panel B. Report Fractile ANOVA:

Source of Variation      df   SS           MS           F      Prob.   ω²
Between Subjects
  Tax Rate               1    1.99         1.99         1.89   0.182   0.010
Within Subjects
  Uncertainty            1    8.10         8.18         8.37   0.005   0.078


Note: Some additional effects not reported in panel B were statistically significant. However, none of these
effects was theoretically predicted, and the percent of variance explained by these effects (ω² statistics)
accounted for about five percent of total variance for the ANOVA. The complete ANOVA table is
available from the authors upon request.


Figure 3 
The Tax by Uncertainty Interaction Effect in Experiment 2 
income is reported as uncertainty decreases when the tax rate is 0.25 (0.50). However,
the interaction predicted by hypothesis H5a is only marginally significant (p<0.069)
and again explains a small proportion of the variation (ω²=0.035). The test of
hypothesis H5b regarding the effects of uncertainty on the reporting fractile provides
the most significant result. A reduction in uncertainty leads to significantly lower mean
fractiles (p<0.005, and ω²=0.079). Given the marginal significance of all but one of the
hypothesized effects and the very small proportion of variation explained, the support 
for the risk-averse hypotheses is not as strong as previously obtained under induced 
risk-neutrality in the first experiment. 


Experiment Three: Measured Risk Preferences 


As an alternative to inducing risk preferences, experiment three employed the
Kachelmeier (1989) risk preference measure. Based upon our analysis, 20 of the 22
subjects who completed the measurement instrument were classified as risk-neutral and
the remaining two subjects as risk-averse.¹⁸ In addition, consistent with the results
reported by Kachelmeier (1989), the average fit of the risk-preference regressions was
excellent (mean R² was 0.89). Given the large proportion of risk-neutral subjects in this
experiment, the following analysis is for the hypothesized effects of the risk-neutral
model. The design and parameters employed in experiment three were identical to
those in experiment two (panel B of table 2).¹⁹ For this design, a tax rate effect is not
expected under risk-neutrality (H2), while hypotheses H3a and H3b hypothesize a main
effect and no effect for uncertainty in the reported income and fractile ANOVAs,
respectively.

The results of repeated measures ANOVAs are displayed in table 6 for reported 
income (panel A) and reporting fractiles (panel B). The tax rate effect was not signifi- 
cant in either ANOVA as implied by hypothesis H2. Furthermore, in the ANOVA for re- 


18 Six subjects in the third experiment did not complete the risk preference measurement task, which was 
administered in a separate session approximately one to two weeks after participation in the experiment. 
While all subject data were used in the analysis reported herein irrespective of risk preferences, the results of 
the ANOVAs were not affected when subjects who did not complete the risk questionnaire or who were not 
risk-neutral were excluded from the analysis. 

19 The plan to make a direct comparison with tax reporting under induced risk-aversion in the second
experiment (see fn. 12) had to be abandoned due to our inability to identify enough subjects with risk-averse
preferences. One possible reason for this situation is that the Kachelmeier (1989) procedure may have ignored
potentially risk-averse subjects (see fn. 24), while a second, suggested by the Associate Editor, is the relatively
small reward involved vis-a-vis actual tax settings where thousands of dollars could be at stake. In order to
evaluate the first possibility, we computed a ratio of each subject’s price to the expected value of the underlying
lottery. Under exact risk-neutrality, this ratio would be unity, whereas ratios below (above) unity would imply
risk-averse (seeking) preferences. The average ratio for the two subjects classified by the Kachelmeier (1989)
technique as risk-averse was 0.26, while the remaining subjects had average ratios of 0.91. The mean ratio for
all subjects in the third experiment was 0.856. Thus, while the Kachelmeier technique appears to have been
successful in identifying individuals who were highly risk-averse, some subjects classified as being risk-neutral
may have been mildly risk-averse. To the extent that such risk-aversion affects subjects’ reporting decisions, we
would expect the risk-neutral model to be less successful than in the first experiment where risk-neutrality was
induced. In any event, the relatively small stakes involved in the risk preference measurement task as well as
the experiment are probably responsible for our inability to identify mildly risk-averse subjects.

20 The Mauchly (1940) sphericity index suggests that the compound symmetric covariance matrix
assumption is seriously violated in the third experiment. Accordingly, the SPSSX software package that we used
computed univariate average F-statistics and corrected for the violation. The Greenhouse and Geisser (1959)
approximation reported adjusts the degrees of freedom in order to symmetrize the covariance matrix. Resulting
probability values are approximate.
Table 6
Results of Repeated Measures ANOVA for Experiment Three
(Measured Risk Preferences), All Trials


Panel A. Reported Income ANOVA:

Source of Variation        df     SS             MS             F       Prob.   ω²
Between Subjects
  Tax Rate                 1      17,043.57      17,043.57      0.09    0.771   0
Within Subjects
  Uncertainty              1      3,828,690.71   3,828,690.71   40.86   0.001   0.232
  Trial                    29     393,728.27     13,576.84
    Greenhouse-Geisser     4.26                                 5.01    0.001   0.020
  Uncertainty by Trial     29     195,187.34     6,730.60
    Greenhouse-Geisser     3.84                                 2.62    0.042   0.008


Panel B. Report Fractile ANOVA:

Source of Variation        df     SS             MS             F       Prob.   ω²
Between Subjects
  Tax Rate                 1      1.87           1.87           0.79    0.384   0
Within Subjects
  Uncertainty              1      1.09           1.09           3.36    0.079   0.006
  Tax by Uncertainty       1      1.95           1.95           5.99    0.022   0.014
  Order by Uncertainty     1      5.04           5.04           15.48   0.001   0.039
  Trial                    29     4.47           0.15
    Greenhouse-Geisser     4.71                                 6.93    0.001   0.033


Note: The probability estimates in table 6 are based on the Greenhouse and Geisser (1959) adjustment, and
thus are approximate. The complete ANOVA table is available from the authors upon request.


ported income, the main effect for uncertainty was significant (p<0.001, with an ω² of
0.232), consistent with hypothesis H3a. The main effect for uncertainty in the fractile
ANOVA (H3b) was marginally significant (p<0.079) but explained almost none of the
variance in the data (ω²=0.006). Finally, in both ANOVAs, an experimental trial effect
was observed as shown in figure 4. The trial effect pattern suggests that learning may
have occurred during the first half of each treatment condition.

In an attempt to remove suspected learning effects from the analysis, a second set 
of ANOVAs was performed using subjects’ last 15 trials under each treatment condi- 
tion. The results presented in both panels of table 7 indicate that the trial main effect 
was no longer significant and explained virtually none of the variance. Likewise, the 
uncertainty effect hypothesized by hypothesis H3a explained approximately 36 percent 
of the variance in panel A of table 7 and is more significant than the result presented in 
table 6. Furthermore, the tax rate remained insignificant as hypothesized by hypothesis 
H2 and explained none of the variance. In the fractile ANOVA (panel B of table 7), none
of the effects in the ANOVA was significant, thereby providing additional support for
hypotheses H2 and H3b.


Overall Explanatory Power 


Overall explanatory power was evaluated by calculating the mean deviations be- 
tween observed fractile reports and the corresponding model-based reporting fractiles. 
Table 7
Results of Repeated Measures ANOVA for Experiment Three
(Measured Risk Preferences), Last 15 Trials in each Treatment Condition


Panel A. Reported Income ANOVA:

Source of Variation        df
Between Subjects
  Tax Rate                 1
Within Subjects
  Uncertainty              1
  Trial                    14
    Greenhouse-Geisser     2.94
  Uncertainty by Trial     14
    Greenhouse-Geisser     2.70


Panel B. Report Fractile ANOVA:

Source of Variation        df
Between Subjects
  Tax Rate                 1
Within Subjects
  Uncertainty              1
  Tax by Uncertainty       1
  Trial                    14
    Greenhouse-Geisser     4.12
Figure 4
The Trial Main Effect in Experiment 3
Table 8
Mean Deviations from Predicted Fractile Descriptive Statistics

                                  Induced           Induced         Measured
Statistic             Total       Risk-Neutrality   Risk-Aversion   Risk Preferences
Minimum               -0.369      0.3898            -0.119          0.000
Maximum               0.471       0.471             0.537           0.714
Mean                  0.130       0.052             0.194           0.223
Standard Deviation    0.147       0.116             0.009           0.122
Skewness              0.382       0.227             -0.012          1.388
Kurtosis              0.617       0.479             -0.510          3.493



Descriptive statistics for the distribution of deviations are presented in table 8 for 
each of the three experiments and on an overall basis. 

While subjects’ actual reporting fractiles were on average 13 percent higher than 
the model-based reports, there was considerable variability among the three experi- 
ments. In experiment one the mean deviation was much smaller than in the other 
experiments. The relatively lower explanatory power of the model in the second
experiment could be attributed to difficulties in inducing preferences consistent with a
negative exponential utility function.²¹ The large deviation from predicted levels in the third
experiment may have been an artifact of the experimental parameters. Unlike the other 
experiments where several different fractiles were optimal, in experiment three the 
lower support (zero fractile) was optimal under all experimental conditions. This could 
have created a possible floor effect and upward skewness among the observed reports. 


IV. Conclusions 


Our experimental findings indicate that the subjects reported more taxable income 
than implied by the tax reporting models. The risk-neutral model generally provided 
more accurate predictions than the risk-averse model, and the performance of the risk- 
neutral model was enhanced when the Berg et al. (1986) method was employed to con- 
trol risk preferences. However, more research will be required to determine whether 
these conclusions can be generalized. 

Despite the difficulties encountered in providing accurate expectations regarding 
the amounts of income reported by subjects, the comparative statics properties of the 
models were supported. In particular, we found that increases in the penalty rate and 
audit probability resulted in significantly higher levels of taxable income being re- 
ported as hypothesized. Furthermore, changes in the uncertainty level interacted with 
the penalty rate and audit probability. 

Several limitations associated with the research require recognition and discus- 
sion. First, in order to make economic incentives dominant as is customary in experi- 
mental economics work, other noneconomic factors such as societal norms were sup- 
pressed (see Spicer 1986 for a discussion of the importance of an interdisciplinary 


21 Given a successfully induced negative exponential function, subjects would have more to lose by
reporting too low than by reporting too high. This suggests that their sensitivity to over-reporting would be
diminished, leading to a higher positive observed deviation than under risk-neutrality.
approach to studying tax reporting). Second, in the real world, taxpayers would not
necessarily know the underlying probability distributions. However, they would have
an opportunity to acquire information from tax practitioners. No such opportunity was
provided in our experiments. Third, the manipulation of taxable income uncertainty
was constrained by the assumption of a uniform distribution. Fourth, the tax agency’s
audits occurred with a fixed probability and were independent of taxpayers’ reports.²²
Finally, our results in all three experiments are joint tests of the effectiveness of our 
methods for dealing with risk preferences and theory. Future experimental research 
could incorporate directly the role of practitioners in the tax reporting process building 
upon recent analytical work (Beck et al. 1990a; Reinganum and Wilde 1989; Scotchmer 
1989a, 1989b). Another extension is to include labor supply (Collins and Plumlee 1991) 
and noneconomic factors. 


Appendix A 
Negative Exponential Utility Model 


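The first-order condition characterizing the optimal reporting decision for a taxpayer having a negative
exponential utility function is:

\[ \frac{1-p}{pq} = \frac{1}{H-L}\int_{R_A}^{H} e^{\lambda(1+q)t(x-R_A)}\,dx. \tag{1A} \]

Recognizing that $e^{\lambda(1+q)t(x-R_A)} = e^{\lambda(1+q)tx}\,e^{-\lambda(1+q)tR_A}$, equation (1A) can be rewritten as:

\[ \frac{(H-L)(1-p)}{pq} = e^{-\lambda(1+q)tR_A}\int_{R_A}^{H} e^{\lambda(1+q)tx}\,dx \tag{2A} \]

\[ = e^{-\lambda(1+q)tR_A}\,\frac{e^{\lambda(1+q)tH} - e^{\lambda(1+q)tR_A}}{(1+q)\lambda t} \tag{3A} \]

\[ = \frac{e^{-\lambda(1+q)tR_A}\,e^{\lambda(1+q)tH} - 1}{(1+q)\lambda t}. \tag{4A} \]

Rearranging equation (4A), we obtain the following expression:

\[ \frac{(H-L)(1-p)(1+q)\lambda t}{pq} + 1 = \frac{e^{-\lambda(1+q)tR_A}}{e^{-\lambda(1+q)tH}}. \tag{5A} \]

Taking the natural logarithm of both sides of expression (5A) and simplifying, we obtain:

\[ \lambda(1+q)tR_A = \ln\left\{\frac{e^{\lambda(1+q)tH}}{[(H-L)(1-p)\lambda(1+q)t/(pq)]+1}\right\}. \tag{6A} \]

Solving expression (6A) for $R_A$ and simplifying, one can verify that:

\[ R_A = H - \frac{1}{(1+q)t\lambda}\,\ln\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H-L)t\lambda\right\}. \tag{7A} \]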
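
As a numerical check on expression (7A), the following sketch evaluates the closed form at the parameter values listed in table 2, panel B (λ=0.023, p=0.5, q=0.2). Truncating the interior solution to the support [L, H] is an assumption made here to handle corner solutions; it is consistent with the value of 700 that table 2 reports for the low-uncertainty condition. The function name is illustrative.

```python
import math


def risk_averse_report(p, q, t, lam, low, high):
    """Interior solution of expression (7A), truncated to the support [low, high]."""
    beta = 1.0 + (1.0 / p - 1.0) * (1.0 + 1.0 / q) * (high - low) * t * lam
    interior = high - math.log(beta) / ((1.0 + q) * t * lam)
    # Assumed corner handling: a report cannot fall outside the income support.
    return min(high, max(low, interior))


LAM, P, Q = 0.023, 0.5, 0.2
for t in (0.25, 0.50):
    for low, high in ((500, 1000), (700, 800)):
        r = risk_averse_report(P, Q, t, LAM, low, high)
        print(t, (low, high), round(r), round((r - low) / (high - low), 3))
```

Under high uncertainty the computed reports round to 579 and 741, as in table 2; the second fractile comes out near 0.483, slightly above the 0.482 shown in the table.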


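The effect of an increase in the audit probability (p) is obtained by differentiating expression (7A) using the
chain rule:

\[ \frac{\partial R_A}{\partial p} = -\frac{1}{(1+q)t\lambda}\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H-L)t\lambda\right\}^{-1}\left[-\frac{1}{p^2}\left(1+\frac{1}{q}\right)(H-L)t\lambda\right] > 0. \tag{8A} \]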


22 In a subsequent study, Beck et al. (1990b) have extended the present experimental setting to permit strate- 
gic interaction between taxpayers’ reporting decisions and the tax agency’s audit policies. Their study tests the 
game-theory model presented in Beck and Jung (1989b).
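Similarly, expression (7A) can be differentiated with respect to q to determine the effect of increasing the
penalty rate:

\[ \frac{\partial R_A}{\partial q} = \frac{1}{(1+q)^2 t\lambda}\,\ln\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H-L)t\lambda\right\} + \frac{1}{(1+q)t\lambda}\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H-L)t\lambda\right\}^{-1}\left(\frac{1}{p}-1\right)\frac{1}{q^2}(H-L)t\lambda > 0. \tag{9A} \]

The tax rate effect can be obtained by differentiating expression (7A) with respect to t:

\[ \frac{\partial R_A}{\partial t} = \frac{1}{(1+q)\lambda t^2}\,\ln\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H-L)t\lambda\right\} - \frac{1}{(1+q)t\lambda}\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H-L)t\lambda\right\}^{-1}\left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H-L)\lambda. \tag{10A} \]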
Letting $\beta = 1 + [(1/p)-1][(1/q)+1]\,t(H-L)\lambda$, expression (10A) is rewritten as:

\[ \frac{\partial R_A}{\partial t} = \frac{1}{(1+q)\lambda t^2}\left[\ln\beta + \frac{1}{\beta} - 1\right]. \tag{11A} \]

Denoting the bracketed terms by T,

\[ \frac{\partial T}{\partial t} = \left(\frac{1}{\beta} - \frac{1}{\beta^2}\right)\frac{\partial\beta}{\partial t} > 0, \tag{12A} \]

since $\beta > 1$ and $\partial\beta/\partial t = [(1/p)-1][(1/q)+1](H-L)\lambda > 0$.
Note also that T=0 when t=0. It thus follows from the latter and expression (12A) that:

\[ T > 0 \quad \text{for all } t > 0. \tag{13A} \]

Since $T > 0$ and $(1+q)\lambda t^2 > 0$, it must be that $\partial R_A/\partial t > 0$. Also note that, since $F(R_A)$ is an increasing function
of $R_A$, the optimal reporting fractile also will increase with the tax rate.

The effects of taxpayer uncertainty about the tax liability can be modeled by perturbing the support of the
taxable income distribution so that x is uniformly distributed over the interval $[L-\Delta, H+\Delta]$, where $\Delta$
denotes the perturbation parameter. Substituting $L-\Delta$ for L and $H+\Delta$ for H into expression (7A) we obtain
the following optimal report corresponding to the perturbed income distribution:

\[ R_A^{\Delta} = (H+\Delta) - \frac{1}{(1+q)t\lambda}\,\ln\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H+2\Delta-L)t\lambda\right\}. \tag{14A} \]

The effects of increases in the level of uncertainty (range of taxable income values) can be determined by
differentiating expression (14A) partially with respect to $\Delta$:

\[ \frac{\partial R_A^{\Delta}}{\partial\Delta} = 1 - \frac{1}{(1+q)t\lambda}\left\{1 + \left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)(H+2\Delta-L)t\lambda\right\}^{-1} 2\left(\frac{1}{p}-1\right)\left(1+\frac{1}{q}\right)t\lambda \tag{15A} \]

\[ = 1 - \frac{2(1-p)}{pq+(1-p)(1+q)(H+2\Delta-L)t\lambda} \tag{16A} \]

\[ = \frac{pq+(1-p)(1+q)(H+2\Delta-L)t\lambda - 2(1-p)}{pq+(1-p)(1+q)(H+2\Delta-L)t\lambda}. \tag{17A} \]

Note that a necessary and sufficient condition for the right-hand side of expression (17A) to be positive is
that:

\[ \Delta > \frac{2-p(2+q)}{2(1-p)(1+q)\lambda t} - \frac{(H-L)}{2}. \tag{18A} \]

A direct implication of expression (18A) is that the effects of uncertainty are expected to interact with the
audit probability, penalty rate, risk aversion ($\lambda$), and the tax rate. Ceteris paribus, as the tax rate or risk
aversion increases, expression (18A) indicates that an increase (reduction) in uncertainty will be more likely to
increase (decrease) reported income. The effect of uncertainty changes on the reporting fractile can be
determined analogously. Substituting equation (14A) into equation (4),

\[ F^{\Delta}(R_A^{\Delta}) = \frac{R_A^{\Delta}-(L-\Delta)}{H+2\Delta-L}. \tag{19A} \]

The effect of changing the level of uncertainty can be determined by differentiating expression (19A) partially
with respect to $\Delta$:

\[ \frac{\partial F^{\Delta}(R_A^{\Delta})}{\partial\Delta} = \frac{1}{(H+2\Delta-L)^2}\left\{\left(\frac{\partial R_A^{\Delta}}{\partial\Delta}+1\right)(H+2\Delta-L) - 2\left(R_A^{\Delta}-(L-\Delta)\right)\right\}. \tag{20A} \]

Making use of equations (16A) and (14A), equation (20A) can be written as:

\[ \frac{\partial F^{\Delta}(R_A^{\Delta})}{\partial\Delta} = \frac{2}{(H+2\Delta-L)^2}\cdot\frac{1}{(1+q)t\lambda}\cdot\left\{\ln\left(\frac{pq+J(t)}{pq}\right) - \frac{J(t)}{pq+J(t)}\right\}, \tag{21A} \]

where $J(t) = (1-p)(1+q)(H+2\Delta-L)t\lambda$.

Since the first two product terms on the right-hand side of equation (21A) are positive, the sign of
equation (21A) will depend solely on the third term enclosed by braces. Letting $Z(t) = \ln[(pq+J)/(pq)] - J/(pq+J)$,
one can verify that:

\[ \frac{dZ}{dt} = \left(\frac{pq}{pq+J}\right)\left(\frac{1}{pq}\right)\frac{\partial J}{\partial t} - \frac{pq}{(pq+J)^2}\frac{\partial J}{\partial t} = \frac{J}{(pq+J)^2}\frac{\partial J}{\partial t} > 0. \tag{22A} \]

Note that when t=0, J(t)=0, so that $Z(t) = \ln(pq/pq) = 0$. Therefore, $Z(t) > 0$ for all $t > 0$, and it thus
follows from equations (21A) and (22A) that $\partial F^{\Delta}(R_A^{\Delta})/\partial\Delta > 0$ for $t > 0$. Hence, in contrast to reported income, the
optimal reporting fractile always increases with $\Delta$, irrespective of the tax rate. Thus, an increase in
uncertainty (income range) will result in a higher fractile being reported, while a reduction in uncertainty will have
the opposite effect. A corollary comparative statics property of the optimal reporting fractile is that there is
no predicted interaction between the tax rate and uncertainty as was the case for reported income.





Appendix B 
Control over Risk Preferences 


Preference Induction in Experiments 1 and 2 


The procedure to induce risk preferences in our experiments made use of three ten-sided dice and “win 
range sheets” such as the one displayed in figure 5. Subjects were provided with win range sheets to make 
salient the mapping between after-tax (and penalty) Francs and numbers in the interval [0,999] which repre- 
sented the probability (to three digits) of winning a 75-cent cash prize in a lottery. 

The win range sheet in figure 5 illustrates a linear mapping of possible ending Francs onto probabilities, 
thereby inducing risk-neutrality. That is, a unitary increase in the subjects’ ending total in Francs results in 
an equal increase in the probability of receiving cash payments. An analogous procedure was also employed 
in the second experiment to induce risk-aversion by substituting a concave (negative exponential) mapping 
function. In this case, a unitary increase in ending Francs leads to ever diminishing marginal increases in the 
probability of receiving a cash payment (win range). Consequently, subjects’ preferences (and decisions) with
respect to the experimental currency outcomes were theoretically consistent with those that would be observed
under risk-aversion (negative exponential utility functions defined directly over the experimental currency
space).

To participate in the cash prize lottery, subjects first determined their after-tax Francs and the associated 
win range points. Then the experimenter rolled three ten-sided dice. The outcome of each die represented 
one digit of a three-digit number (the “prize number’’). The order of the digits in the number was determined 
by the color of each die. If the prize number was less than or equal to the win range points, then the subject 
won a cash prize. However, when the prize number was greater than the win range points, the subject won 
nothing. 
Figure 5
Sample Win Range Sheet for 0.25 Tax Rate, 0.20 Penalty Rate,
and Low Uncertainty

WIN RANGE SHEET

Ending Total    Win      Increase in
in Francs       Range    Win Range
795               0        0
796              33       33
797              67       34
798             100       33
799             133       33
800             167       34
801             200       33
802             233       33
803             267       34
804             300       33
805             333       33
806             367       34
807             400       33
808             433       33
809             467       34
810             500       33
811             533       33
812             567       34
813             600       33
814             633       33
815             667       34
816             700       33
817             733       33
818             767       34
819             800       33
820             833       33
821             867       34
822             900       33
823             933       33
824             967       34
825             999       32
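
The linear mapping on the sample sheet and the dice-based lottery rule can be illustrated with a short Python sketch. This is only a sketch under stated assumptions: the function names are hypothetical, the rounding rule (nearest integer, capped at 999) is inferred from the values printed on the sheet rather than stated in the text, and the endpoints 795 and 825 apply to this sample sheet only.

```python
import random

FRANCS_MIN = 795   # lowest ending total on the sample sheet
FRANCS_MAX = 825   # highest ending total on the sample sheet
MAX_POINTS = 999   # largest possible three-digit prize number


def linear_win_range(francs):
    """Map ending Francs to win range points on a linear (risk-neutral) schedule."""
    if not FRANCS_MIN <= francs <= FRANCS_MAX:
        raise ValueError("ending total is outside the range of the sheet")
    span = FRANCS_MAX - FRANCS_MIN
    # Round 1000 * (share of the range) to the nearest integer (assumed rule),
    # then cap at 999 because the dice cannot show a number above 999.
    points = (2000 * (francs - FRANCS_MIN) + span) // (2 * span)
    return min(points, MAX_POINTS)


def roll_prize_number(rng):
    """Roll three ten-sided dice; each die supplies one digit of the prize number."""
    hundreds, tens, units = (rng.randrange(10) for _ in range(3))
    return 100 * hundreds + 10 * tens + units


def wins_prize(prize_number, win_range_points):
    """The subject wins when the prize number does not exceed the win range points."""
    return prize_number <= win_range_points
```

For example, `wins_prize(roll_prize_number(random.Random()), linear_win_range(810))` plays one lottery for a subject who ends with 810 Francs. A concave schedule could be substituted for `linear_win_range` to mimic the risk-aversion treatment, but its parameters are not given here.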





Risk Measurement in Experiment 3 


Since the experiments were, by necessity, a joint test of subjects’ risk preferences and our model, and
given the potential importance of subjects’ risk-taking attitudes for the theoretical predictions, an alternative
to the Berg et al. (1986) mechanism was employed in the third experiment reported herein. Specifically, we
attempted to estimate subjects’ risk preferences using a refinement of the Becker et al. (1964) technique
suggested by Kachelmeier (1989). This approach first required assessment of subjects’ certainty equivalents for a
series of simple lotteries (each offering some probability of winning $100). The certainty equivalents were
assessed as the maximum price that a subject would be willing to pay for a chance to play the lottery in a
second-price auction. To provide real financial incentives for subjects in this task, after all subjects had
indicated a maximum purchase price, one was chosen at random to participate in a real lottery. The real lottery
was selected at random from the series of lotteries upon which subjects bid and was scaled so that the
purchase price charged to the chosen subject was 20 percent of their indicated price²³ and the available prize
was $20.


²³ The expected cash payment for subjects in the lottery was of the same magnitude as expected cash
earnings in one trial of the experiment. The expected cash payment for the lottery was $0.45, computed as 1/22
(chance of being selected to participate) x 0.5 (mean chance of winning) x $20 (prize).

Subsequently, the certainty equivalents were employed (along with experimenter-imposed anchors of $0
for a lottery that provided no chance of winning $100 and $100 for a lottery in which winning $100 was
certain) to estimate a quadratic OLS regression of the form:


$$\text{Prob}_i = \alpha + \beta_1(\text{Price}_i) + \beta_2(\text{Price}_i)^2 + \varepsilon_i \tag{1B}$$


where Prob_i was the exogenous probability of winning lottery i, Price_i was the subject’s minimum selling
price for lottery i, and α, β₁, β₂, and ε_i were the regression coefficients and residual, respectively. While the β₁
coefficient should be positive for all subjects, a negative (positive) β₂ will reflect risk-averse (risk-seeking)
preferences. Accordingly, subjects were classified into risk-averse, risk-neutral, and risk-seeking categories
based on the two-tailed t-statistic for the test of the null hypothesis β₂=0. In particular, those having
significantly negative β₂ coefficients (at p<0.05) were identified as risk-averse, while those with significantly
positive (insignificant) coefficients were labeled as risk-seeking (risk-neutral).


²⁴ The above decision rule for classifying subjects as being risk-averse is potentially conservative in that it
does not take into account other manifestations of risk-aversion affecting the α and β₁ coefficients in the
regression. In order to overcome this potential problem, we employed a second approach (see fn. 19).
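
The classification rule based on equation (1B) can be sketched in Python as follows. This is an illustrative sketch, not the authors' procedure: the function names are hypothetical, NumPy is assumed to be available, and the critical value `t_crit` must be supplied by the caller (for example, from a t table with n − 3 degrees of freedom) because the text reports only the p<0.05 criterion.

```python
import numpy as np


def fit_quadratic(prices, probs):
    """OLS of Prob_i on a constant, Price_i, and Price_i squared.

    Returns the coefficients (alpha, beta1, beta2) and their t-statistics.
    """
    prices = np.asarray(prices, dtype=float)
    probs = np.asarray(probs, dtype=float)
    X = np.column_stack([np.ones_like(prices), prices, prices ** 2])
    coef, *_ = np.linalg.lstsq(X, probs, rcond=None)
    resid = probs - X @ coef
    dof = len(probs) - X.shape[1]          # n minus three estimated coefficients
    sigma2 = resid @ resid / dof           # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)  # classical OLS covariance matrix
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, t_stats


def classify(t_beta2, t_crit):
    """Two-tailed rule on the t-statistic for the null hypothesis beta2 = 0."""
    if t_beta2 < -t_crit:
        return "risk-averse"
    if t_beta2 > t_crit:
        return "risk-seeking"
    return "risk-neutral"
```

A subject whose prices lie below the lotteries' expected values produces a concave fit of probability on price, so the estimated β₂ is negative; whether that subject is labeled risk-averse then depends on the supplied critical value.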


References 


Aitken, S., and L. Bonneville. 1980. A General Taxpayer Opinion Survey. Prepared for the Office of 
Planning and Research, Internal Revenue Service (March). 

Alm, J. 1988. Uncertain tax policies, individual behavior, and welfare. American Economic Review 
78 (March): 237-45. 

Alm, J., B. Jackson, and M. McKee. 1990. Institutional uncertainty and taxpayer compliance. Working
paper, University of Colorado at Boulder.

Arrow, K. 1971. Essays in the Theory of Risk-Bearing. Homewood, IL: Markham.

Beck, P., J. Davis, and W. Jung. 1990a. The role of tax practitioners in tax reporting: A signalling 
game. Unpublished manuscript (July). 

Beck, P., J. Davis, and W. Jung. 1990b. Taxpayer aggression and tax complexity under strategic and
nonstrategic audits: Experimental evidence. Unpublished manuscript (November).

Beck, P., and W. Jung. 1989a. An economic model of taxpayer compliance under uncertainty. Journal
of Accounting and Public Policy 8 (Spring): 1-27.

Beck, P., and W. Jung. 1989b. Taxpayer’s reporting decisions and auditing under information
asymmetry. The Accounting Review 64 (July): 468-87.

Becker, G., M. Degroot, and J. Marschak. 1964. Measuring utility by a single-response sequential 
method. Behavioral Science 9 (July): 226-32. 

Berg, J., L. Daley, J. Dickhaut, and J. O’Brien. 1986. Controlling preferences for lotteries on units of 
experimental exchange. Quarterly Journal of Economics 101 (May): 281-306. 

Collins, J., and R. Plumlee. 1991. The taxpayer’s labor and reporting decision: The effect of audit 
schemes. The Accounting Review (July): 559-76. 

Davis, J., and C. Swenson. 1988. The role of experimental economics in tax policy research. Journal 
of the American Taxation Association 10 (Fall): 40-59. 

Greenhouse, S., and S. Geisser. 1959. On methods in the analysis of profile data. Psychometrika 24 
(June): 95-112. 

Kachelmeier, S. 1991. Experimental assessment of monetary risk preferences. Auditing: A Journal 
of Practice & Theory (Forthcoming). 

Klepper, S., M. Mazur, and D. Nagin. 1988. Expert intermediaries and legal compliance: The case 
of tax preparers. Working paper, Carnegie Mellon University (May). 

Mauchly, J. 1940. Significance test for sphericity of a normal n-variate distribution. Annals of
Mathematical Statistics 11 (June): 204-09.

Milliron, V. 1985. A behavioral study of the meaning and influence of tax complexity. Journal of 
Accounting Research 23 (Autumn): 794-816. 

Plott, C., and S. Sunder. 1982. Efficiency of experimental security markets with insider informa- 
tion: An application of rational expectations models. Journal of Political Economy 90 (August): 
663-98. 

Pratt, J. 1964. Risk aversion in the small and in the large. Econometrica 32 (January-April): 122-36. 













Rothschild, M., and J. Stiglitz. 1970. Increasing risk I: A definition. Journal of Economic Theory 2
(September): 225-43.

Schadewald, M. 1989. Reference point effects in taxpayer decision making. Journal of the American 
Taxation Association 10 (Spring): 68-84. 

Scotchmer, S. 1989a. Who profits from taxpayer confusion? Economics Letters 29 (February): 49-55. 

Scotchmer, S. 1989b. The effect of tax advisors on tax compliance. In Taxpayer Compliance: Social Science
Perspectives, Volume 2, edited by J. Roth and J. Scholz, 156-81. Philadelphia: University of
Pennsylvania Press.

Scotchmer, S., and J. Slemrod. 1989. Randomness in tax enforcement. Journal of Public Economics 38
(February): 17-32.

Shavell, S. 1988. Legal advice about contemplated acts: The decision to obtain advice, its social 
desirability, and protection of confidentiality. Journal of Legal Studies 17 (January): 123-50. 

Sheppard, L., and M. Evans. 1990. Simplification means tough choices, AICPA/ABA conferees 
agree. Tax Notes 46 (January 22): 381-87. 

Slemrod, J. 1988. Complexity, compliance costs, and tax evasion. Why People Pay Taxes: A Social 
Science Perspective. National Academy of Science. Forthcoming. 

Smith, V. 1982. Microeconomic systems as an experimental science. American Economic Review 
72 (December): 923-55. 

Spicer, M. 1986. Civilization at a discount: The problem of tax evasion. National Tax Journal 39 
(March): 13-20. 

Wilde, L. 1981. On the use of laboratory experiments in economics. In The Philosophy of Eco- 
nomics, edited by J. Pitt, 137-48. Dordrecht, Germany: Reidel. 

Yitzhaki, S. 1974. A note on income tax evasion: A theoretical analysis. Journal of Public Eco- 
nomics 3 (May): 201-02. 








THE ACCOUNTING REVIEW
Vol. 66, No. 3
July 1991
pp. 559-576


The Taxpayer’s Labor and Reporting 
Decision: The Effect of 
Audit Schemes 


Julie H. Collins
University of North Carolina at Chapel Hill

R. David Plumlee
Kansas State University


SYNOPSIS: Individuals failed to report between $70 and $79 billion in fed- 
eral taxes due on legal income received in 1986. This amount is equivalent 
to approximately 20 percent of federal income taxes due and 40 percent of 
the federal deficit in that year. Including underreporting of organizations 
and of those who receive illegal income likely pushes the annual amount of 
taxes due, but not paid, above the $100 billion mark (Roth et al. 1989, 1). 
Thus, reporting income for tax purposes may be characterized by a high 
degree of inaccurate self-reporting with an immediate economic impact. 

This problem can be countered by auditing self-reported income. An 
audit scheme is the approach by which a taxing authority chooses the self- 
reports to be audited. This paper describes an experiment that examines 
the effect of three audit schemes on taxpayers’ joint or related decisions 
about the level of labor to be supplied and the amount of income (if any) to 
underreport. The three audit schemes differ principally in the information 
used by the taxing authority to determine which self-reports of income to 
audit: (a) no information is used—reports are chosen strictly at random; (b) 
reported income is the basis for choosing the reports to be audited; (c) an 
estimate of true income is used in addition to reported income to select 
audit cases. Also examined is the impact of alternative tax rates and 
penalty levels on earned and underreported income. 


The authors gratefully acknowledge the helpful comments of workshop participants at the University of
Florida, University of Kansas, Kansas State University, and the Southeast AAA Doctoral Consortium. In
addition, we are particularly indebted to Louis L. Wilde and four anonymous referees for their extensive
feedback; Jean Cooper for sharing her computerized work task; and Daniel Murphy and Charles Davis for their
capable research assistance. Finally, the authors thank the University of North Carolina Business Foundation
for financial support.


Submitted January 1990. 
Revisions received August 1990, January and February 1991. 
Accepted February 1991. 




Traditional theoretical work shows that income taxes alter the amount
of labor supplied (e.g., Flanagan et al. 1984, 143-5). However, predictions 
regarding the direction of the change in labor supply are ambiguous: the 
labor supplied may increase as the taxpayer tries to compensate for the 
income lost to taxes by working more (the income effect), or the amount of 
labor may drop as alternatives to work become more economically 
attractive (the substitution effect). Swenson (1988) experimentally 
examined labor supply in response to varying marginal tax rates, and the 
overall results are consistent with the substitution effect dominating the 
income effect.

The taxpayer’s labor response to the tax system may not be deter- 
mined solely by the direct effect of tax rates. Once a tax system is imposed, 
those electing to underreport their actual earned income may also elect to 
increase their labor supply as a result of a diminished tax substitution effect 
in the presence of underreporting. If the reporting decision directly affects 
the labor supply decision, then any tax system parameter (e.g., penalty and 
audit) affecting the reporting decision will also impact the labor supply de- 
cision. 

Consistent with the reasoning above, Pencavel (1979) analytically por- 
trays the taxpayer’s response to the tax system (i.e., rates, penalty, and 
probability of audit) as a joint or related decision about labor supply and the 
income reported. In a similar vein, Atkinson and Stiglitz (1980, 27-8) refer 
to the impact of taxation (and evasion opportunities) on labor supply in
terms of income, substitution, and “financial” effects. They explain that fi- 
nancial effects arise when “the same real activity can correspond to several 
different forms of payment, which are taxed at different rates.” The most 
extreme example offered of the financial effect on labor supply is the 
underground or “hidden” economy. Taxpayers may restructure their activi- 
ties or transactions so as to operate in an economic environment free of 
taxation. 

Recognition of the jointness of the labor and reporting decisions
suggests that it is reasonable to incorporate both variables in empirical
studies to test how the taxpayer responds to the tax environment. In
another study, Collins et al. (1990) found evidence that a positive
relationship exists between work effort and noncompliance opportunities. In
general, subjects with a noncompliance opportunity worked harder than
those without an underreporting opportunity.

The effects of the tax environment (i.e., audit schemes, tax rates, and
penalty levels) on income earned and underreporting of income are tested
in a laboratory setting because well-controlled real-world counterparts
are not available. The experimental design consists of three audit
schemes and two levels each of tax rate and penalty, resulting in 12
between-subject conditions. The subject’s task is a computerized letter
decoding task. During the experiment, each subject participated in a
practice work session and four five-minute actual work sessions. At the end
of each work session, subjects self-reported the income earned and
corresponding tax liability. One of the three audit schemes was followed to
select reports to verify. The reports denoting whether audit selection had
occurred were returned privately to the subjects before the beginning of
the next work session. Subjects were compensated based on the actual
income earned less the taxes self-reported and possible penalties
(proportional to the tax deficiency) imposed as a result of the audit.

The subjects’ underreports of income and work effort are analyzed for 
each work session separately and across work sessions two through four 
together using a multivariate analysis of covariance (MANCOVA). The 
number of letters decoded by each subject in the practice work session is 
included as the covariate. This measure serves as a surrogate for ability. 
Subsequent analyses of covariance (ANCOVAs) and mean comparisons are 
performed for each dependent variable. Verification schemes that incor- 
porate the preliminary information signal sent by the taxpayer are more suc- 
cessful overall in curbing underreporting than purely random audit models. 
Also, underreporting is generally greater when tax rates are high and pen- 
alty levels are rather low. Finally, underreporting and effort are positively 
related. Those subjects electing to underreport also produce significantly 
more income. 


Key Words: Tax compliance, Taxpayer self-reporting behavior, Labor sup- 
ply, Auditing schemes, Behavioral experiment. 


Data Availability: Data are available on request from the authors. 


A description of the research is organized as follows. The first section
provides a background for the design of the study by reviewing relevant previous
research and providing conjectures regarding the expected effects of the audit
schemes, tax rates, and penalty levels on underreported income and income actually
earned. The next section examines in detail the experimental methods. The analysis of
the data and a discussion of the results appear in the following section. Finally, a
summary of the study, as well as directions for future research, are provided.


I. Tax Compliance, Labor Supply and Audit Schemes 
Three Audit Schemes 


Unlike the recent analytical models that model the behaviors of both the taxpayer 
and the taxing authority (Graetz et al. 1986; Reinganum and Wilde 1985, 1986), we focus 
only on how the taxpayer’s underreporting behavior and work effort respond to differ- 
ent audit schemes (in conjunction with other tax environment characteristics). There- 
fore, we fix at a constant level the total amount of auditing that the taxing authority can 
employ in each scheme. With the assumption of particular audit cost parameters, this 
study could be considered somewhat analogous to the situation in the models refer- 
enced above where audit resources are fixed by a binding budget constraint at a level 
below the equilibrium. There is evidence that this may be the case in the United States 
today (Graetz et al. 1984, 3—4). 

Audit schemes found in the tax compliance literature might be classified as
random, cut-off, and conditional (Graetz et al. 1986; Reinganum and Wilde 1985, 1986).
The random audit scheme simply provides each self-report of income an equal chance
of being chosen for verification by an audit; no information is used to select the reports
to be audited. The cut-off and conditional audit schemes incorporate the preliminary
information transmitted when a taxpayer self-reports income and the corresponding
tax liability. Thus, the taxing authority is able to observe the reported income signal
before selecting which returns to audit.

Under the cut-off audit scheme, audit resources are employed to verify reports of
taxpayers reporting the lowest income levels. The operation of the nonrandom schemes
depends on whether a binding budget constraint is imposed on the taxing authority. In
the face of a binding budget constraint, the cut-off decision rule is to audit the lowest
reported incomes until the cut-off or budget constraint is met.

In contrast, the conditional audit scheme requires, in addition to the reported
income, a source of information representing a noisy signal of the taxpayer’s true
income-earning potential.¹ This signal is used to classify taxpayers into groups, some of which
have greater ex ante chances for earning high income levels. These groups are used in
conjunction with a cut-off such that within those groups with greater ex ante chances of
high income, individuals reporting the lowest incomes are audited. That is, the
presence of a budget constraint operates for the high income group as with the pure cut-off
audit scheme.
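
The three selection rules described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names, the fixed-size "high" group in the conditional rule, and the tie-breaking by subject number are hypothetical choices made for illustration and are not taken from the experimental protocol.

```python
import random


def select_random(reports, n_audits, rng):
    """Random scheme: every report has the same chance of being audited."""
    return set(rng.sample(sorted(reports), n_audits))


def select_cutoff(reports, n_audits):
    """Cut-off scheme: audit the lowest reported incomes until the budget is used."""
    ranked = sorted(reports, key=lambda s: (reports[s], s))  # ties broken by id (assumed)
    return set(ranked[:n_audits])


def select_conditional(reports, signals, n_audits, high_group_size):
    """Conditional scheme: apply the cut-off rule within the high-signal group only.

    `signals` is a noisy indicator of income-earning potential; the size of the
    high group is an assumed parameter.
    """
    high_group = sorted(signals, key=lambda s: (-signals[s], s))[:high_group_size]
    ranked = sorted(high_group, key=lambda s: (reports[s], s))
    return set(ranked[:n_audits])
```

Here `reports` and `signals` are dictionaries keyed by a subject identifier. Holding `n_audits` fixed across the three functions mirrors the constant total amount of auditing assumed in the text, while the identity of the audited subjects depends on the information each rule uses.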


The Taxpayers’ Joint Decision 


As stated earlier, taxpayers have two variables through which they can respond to a 
given tax environment. They can adjust the labor they supply, the amount of true in- 
come they report, or both. A simple analytical model based on Pencavel (1979) high- 
lights the jointness of the taxpayer’s decisions about work effort and reporting, and 
depicts the relationship among the variables in the tax environment we examine in this 
study. 

The taxpayer’s decision is portrayed as one of maximizing utility defined over
reported income (y) and work effort (n). In all cases, a random audit scheme with
independent probability, p, of being audited is assumed. In one case, a proportional tax rate,
t, is assumed.² Z is the taxpayer’s true taxable income that is a function of n, and f is the
penalty for underreporting that is a function of underreported taxes. Thus, after-tax
income, given either truthful reporting or no audit, is

$$Y = Z(n) - ty. \tag{1}$$

If the taxpayer underreports and there is an audit, after-tax income is represented by:

$$Y^* = Z(n) - ty - f[t(Z - y)]. \tag{2}$$

The taxpayer then is portrayed as maximizing the expected utility of the following
prospect:

$$EU(Y, n) = p[U(Y^*, n)] + (1 - p)[U(Y, n)]. \tag{3}$$


¹ As discussed subsequently, the signal of true income-earning potential used in this experiment is the
performance in the practice work session.

² Pencavel (1979) also relaxes the assumption of linear income tax schedules and considers the cases of
regressive and progressive tax schedules.

Pencavel (1979) concludes that allowing the work and reporting decisions to be
joint makes the effects of the tax (t), the penalty (f), and the probability of an audit (p) 
on reported income indeterminate. However, given a proportional tax rate schedule, 
the amount of underreporting, (Z—y), is strictly decreasing in the penalty, f, and the 
probability of a random audit, p. The effect of proportional taxes, t, remains indeter- 
minate. In addition, this model of the taxpayer’s joint labor and reporting decision can- 
not be tractably extended to capture the effects of nonrandom audit schemes. Thus, we 
turn to empirical analyses to examine the effect on income earned and income under- 
reported by taxpayers of nonrandom audit schemes, as well as other tax environment 
characteristics. 
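
For readers who want to evaluate the prospect numerically, the following Python sketch computes after-tax income with and without an audit and the resulting expected utility. It is a minimal sketch, not part of the original model: the function names are hypothetical, true income Z(n) is passed in as a number `z`, and both the penalty function and the utility function are supplied by the caller because the text does not specify their functional forms.

```python
def after_tax_income(z, y, t, penalty, audited):
    """After-tax income for true income z, reported income y, and tax rate t.

    `penalty` maps underreported taxes t * (z - y) to a charge and is applied
    only when the report is audited.
    """
    income = z - t * y
    if audited:
        income -= penalty(t * (z - y))
    return income


def expected_utility(z, y, n, t, p, penalty, utility):
    """Probability-weighted utility over the audited and unaudited outcomes."""
    y_audited = after_tax_income(z, y, t, penalty, audited=True)
    y_unaudited = after_tax_income(z, y, t, penalty, audited=False)
    return p * utility(y_audited, n) + (1 - p) * utility(y_unaudited, n)
```

As an illustration with assumed values, take z = 10, y = 6, t = 0.3, p = 0.2, a penalty of twice the underreported taxes, and a utility function that returns income unchanged. After-tax income is then 8.2 without an audit and 5.8 with one, and the expected value is 7.72.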


Conjectures 


Since the current state of analytical modeling does not allow for advancing rigor- 
ous empirical predictions, we offer some conjectures consistent with previous litera- 
ture about how the audit schemes might be ordered and the effects of tax rates and pen- 
alty levels on the two dependent variables—underreporting of income and work effort 
(as measured by income actually earned). With regard to underreporting, we expect 
that the greatest amount of underreporting will occur under the random scheme, fol- 
lowed by the cut-off and conditional audit schemes, respectively. To facilitate compari- 
sons among the schemes, it is necessary that population audit rates be held constant 
across the three audit schemes. However, by the very nature of the nonrandom audit 
schemes, different deterrent effects result because the conditional probability of an 
audit for individual taxpayers varies with the likelihood of their underreporting. In a 
tax environment where self-reports are chosen according to a cut-off scheme, the 
conditional probability of an audit increases with the amount of underreporting for all 
subjects. Under the conditional audit scheme, taxpayers with the higher ability to earn 
income are audited, since they have a greater possibility of underreporting large 
amounts. Within this group, the conditional probability of being audited increases with 
the amount of underreporting. 

The expected effect of tax rate on underreporting is ambiguous. Most empirical 
studies tend to find either an insignificant or positive relation between tax rates and 
underreporting (see Jackson and Milliron 1986; Baldry 1987, 360 for a summary and 
discussion of these studies). However, it is difficult in many empirical studies to disen- 
tangle the effects of income and tax rates (e.g., Clotfelter 1983). We posit a positive 
relation between tax rates and underreporting will obtain in our experimental environ- 
ment where income is endogenous but independent of tax rate. We also expect the 
deterrent effects of any audit scheme will be greater the higher the penalty for noncom- 
pliance. 

The effects of tax rate, penalty, and audit schemes on work effort operate, at least
partially, through their effect on underreporting, which may partially or completely
negate the impact of taxes. Assuming that increased underreporting will result in
increased work effort,³ we expect the factors that generate positive effects on
underreporting also to have positive effects on effort. Thus, the effect of each variable on
underreporting will filter to effort when income is endogenous.


³ This assumption is consistent with the financial effect (Atkinson and Stiglitz 1980, 27-28) and the
empirical results of Collins et al. (1990).

Tax rate is expected to have both primary and secondary effects on effort. The
primary effect of rate on effort comes about through the income and substitution effects. If
the substitution effect dominates the income effect (Swenson 1988), higher tax rates
will have a negative impact on effort. However, if underreporting increases with tax
rates and the financial effect is present, effort may not decrease with higher tax rates
and, instead, might increase. Thus, the ultimate effect of tax rate on effort is expected to
be small, since the financial effect mitigates an otherwise dominant substitution effect
of tax rates on effort.

The expected effect of penalty on effort is only secondary and takes place through 
underreporting. If lower penalties increase underreporting, then the substitution effect 
of taxes on effort is mitigated with lower penalties. Hence, greater effort is expected. 

Like tax rates, the audit schemes we examine may have both primary and second- 
ary effects on effort. As with underreporting, effort under the three schemes is ex- 
pected to be ordered from highest to lowest as follows: random, cut-off, and condi- 
tional. However, the audit selection rule in the cut-off and conditional schemes may 
also affect effort directly; greater effort for subjects in the cut-off scheme and for the 
high ex ante income subjects in the conditional scheme implies that they can under- 
report relatively more than others without being selected for audit. This complicates 
the secondary effect (through underreporting) of audit schemes on effort. The relative 
magnitudes of these conflicting primary and secondary effects cannot be predicted a 
priori, but they determine which audit scheme results in the highest effort. 


II. Experimental Methods 


A laboratory labor setting was used to test the effects of audit schemes, tax rates, 
and penalty levels on underreported income and work effort. The subjects worked at a 
computer-based decoding task for piece-rate compensation that was paid directly to 
the subjects in actual U.S. currency. All taxes and penalties were collected separately 
after making direct payments. The subjects’ unaudited tax payments were based on 
their self-reported income. Information asymmetry was present in that the subjects did, 
but the taxing authority did not, know their true income. The amount of actual income 
earned less the amount of income voluntarily reported represents underreporting 
(UNDER). The other dependent variable, EFFORT, is measured by the amount of actual 
income earned in performing the decoding task.⁴


Experimental Design 


Three independent variables were manipulated as between-subjects factors: (1) 
RATE represents the proportional tax rate that was assessed on the subjects’ reported 
earnings at either of two levels: 30 percent or 60 percent; (2) AUDIT SCHEME is the de- 
cision rule the taxing authority followed in determining which reports to audit; and (3) 
PENALTY represents the taxing authority’s charge for underreporting. PENALTY was 
a function of underreported taxes and was manipulated at two levels. The penalty rate 
faced by each subject was either 1.2 times underreported taxes or 2.0 times under- 
reported taxes. 

The AUDIT SCHEME took on three levels: random, cut-off, and conditional. The
unconditional probability of a taxpayer’s report being audited was set at 20 percent and
was held constant across all three schemes. In the case of the random scheme, each
taxpayer’s report had an equal chance of being selected (20 percent) regardless of the level
of reported income. In the cut-off and conditional audit environments, individual
subjects’ conditional audit probabilities varied according to reported income relative to
other income reports. Since reports were private, subjects in these treatment groups
were uncertain about their individual audit probabilities. In all environments, the
subjects were aware of the audit decision rule being followed and the total number of
reports to be selected.⁵

⁴ Effort and ability jointly affect performance. Subsequent analyses include a covariate to control for
differences in subjects’ abilities.


Subjects’ Task 


The experimental task was a letter decoding exercise which was an extensively
modified, computerized variation of tasks previously used by Chow (1983) and Chow et
al. (1988). A computer program generated a sequence of 40-column card images
containing randomly generated “punched-hole” combinations. Each card image displayed
ten “punched-hole” combinations from which the subject decoded ten letters using a
decoding key. Once the subject entered the letters for one card image into the
computer, another 40-column card image appeared. This task offered a straightforward
measure of the income produced: the number of letters correctly decoded per work period
times the wage rate.

Subjects were compensated on a piece-rate basis. Based on the results of a pretest, a
before-tax wage rate (w) of $.08 per correctly decoded letter was established to ensure
relative importance of the compensation to the subjects. On average, subjects earned
approximately $10 net of taxes and possible penalties for participation in experimental
sessions which lasted about 90 minutes.⁶


Experimental Procedure 


There were 12 treatments (3 x 2 x 2); one for each combination of AUDIT SCHEME, 
RATE, and PENALTY levels. Subjects participated, in groups of ten, in one of the 12 ex- 
perimental sessions. All of the subjects in each session had the same level of a given fac- 
tor, or treatment. 
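
The 3 x 2 x 2 layout can be enumerated with a few lines of Python. This is an illustrative sketch only: the variable and function names are hypothetical, while the factor levels (three audit schemes, tax rates of 30 and 60 percent, and penalties of 1.2 and 2.0 times underreported taxes) are those stated in the design.

```python
from itertools import product

AUDIT_SCHEMES = ("random", "cut-off", "conditional")
TAX_RATES = (0.30, 0.60)        # proportional tax rates
PENALTY_MULTIPLES = (1.2, 2.0)  # multiples of underreported taxes


def treatment_cells():
    """Return one dictionary per between-subjects treatment combination."""
    return [
        {"audit_scheme": scheme, "rate": rate, "penalty": multiple}
        for scheme, rate, multiple in product(AUDIT_SCHEMES, TAX_RATES, PENALTY_MULTIPLES)
    ]
```

Calling `treatment_cells()` yields 12 combinations, one for each experimental session of ten subjects.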

The sequence of events for each experimental session is shown in table 1. Subjects 
met in a designated personal computer lab. As the subjects arrived, each was asked to 
choose a personal computer for a work station, was given an employee number, and 
was asked to complete a risk preference questionnaire which presented nine pairs of 
hypothetical payoffs. Each payoff pair represented a certainty and a gamble over two 
outcomes.⁷


5 As a result of repeated trials, some subjects in the conditional scheme may have become aware that they
were in the “targeted” audit group (i.e., higher ex ante income potential), while the others remained uncertain.
However, perfect information about individual audit probabilities was not obtained. Even for those subjects 
who knew they were in the “targeted” group, their individual probability of audit still remained conditional on 
reported income. 

6 The mean response to a debriefing question asking subjects whether they were compensated adequately
(in cash) for the effort they expended while participating in this experiment was 4.18. Responses were mea- 
sured on a five-point scale where “strongly disagree” was scored as one and “strongly agree” was scored as 
five. 

7 In the gamble, there was a 50 percent chance of getting one payment greater than the certain payoff and a
50 percent chance of receiving another payment that was less than the certain payoff by an equivalent amount. 
Table 1
Sequence of Experimental Events

1. Subjects select a work station and are assigned employee numbers.
2. Subjects complete risk questionnaire.
3. Lab monitor distributes and discusses Instruction Set 1 (overview of experiment and introduction).
4. Computer program details hypothetical firm and work task and leads subjects through a training session.
5. Subjects perform practice work session.
6. Subjects complete work sessions 1 to 4.
   a. Subjects perform work and computer generates production reports.
   b. Subjects self-report taxes to tax collector.
   c. Tax collector selects two returns for audit.
   d. Lab monitor compares production report and tax return for audited subjects and records any applicable penalties on the returns.
7. Subjects complete debriefing questionnaire.
8. Subjects submit debriefing questionnaire and four production reports to bursar. Bursar pays total earned cash wages to subjects.
9. Subjects submit tax returns to tax collector. Subjects pay total taxes plus any applicable penalties to collector.
10. Conclusion of experiment.


Subjects were then provided a set of instructions giving them an overview of the 
experimental sequence and describing their role in the study. To minimize any poten- 
tial demand effects, subjects were told that they were participating in a simulated use of 
computers in performing quality control tasks. Subjects read the hard copies of the instructions
and reviewed them with the lab monitor.⁸ The computer program began by
describing the hypothetical firm and the work task to be performed in detail.⁹ Next, a
training session, in which several cards were decoded, was completed by each subject 
to further familiarize them with the nature of the decoding task. 

Subjects then participated in a five-minute practice session. They were told to work 
the entire five minutes and to use the session to practice the task, so they could earn 
more in the paid work sessions. At the end of the practice session, the computer auto- 
matically printed a production report that listed the subject’s employee number, the 
work session number, the number of letters correctly decoded during the session, and 
the gross pay that would have been earned had this been a paid work session. The lab 
monitor collected the practice production reports. 





Thus, the expected value of the gamble was always equivalent to the certain payoff. The hypothetical situations 
consisted of three certain payoffs ($5.00, $7.50, and $12.50) and three gambles for each payoff. The three gam- 
bles ranged in their degree of dispersion from the certainty equivalent. Subjects were asked to indicate whether 
they would prefer the gamble or the certain payoff, or were indifferent between the two. The purpose of this 
rather simple measure of risk preference was to serve as a control to ensure risk preferences did not vary sig- 
nificantly across treatment groups. Also, it was included in some preliminary analyses as a covariate. 

8 The roles of the lab monitor, the bursar, and the tax collector were performed by the experimenters and a
doctoral student. 

9 The computer software led each subject through several screens which described the hypothetical firm
they were working for and the nature of the task. This was done to reinforce that exploring the use of computers 
in performing a quality control task was the objective of the experiment. Subjects were told that actual wages, 
taxes, audits, and penalties were being added to the scenario to increase its realism. Since each subject experi- 
enced only one treatment level and was unaware that any other treatments existed, potential demand effects 
were further minimized. 
Specific instructions for work sessions one through four were distributed and reviewed
with the subjects to ensure thorough understanding. These instructions detailed
the wage rate of $.08 per letter and described the tax environment in which the subjects 
would be working. The applicable tax rate, audit selection decision rule, and penalty 
rate assessed on underreported taxes were explained.¹⁰ The remaining sequence of experimental
events was discussed, and subjects were allowed to ask questions.

Four independent work sessions in which subjects were paid for their production 
followed. Before each session began, the subjects were given a new decoding key to 
minimize learning effects across sessions. At the completion of each work session, a 
production report was automatically generated. The subjects kept these reports and did 
not reveal them to anyone unless they were chosen to be audited that session. 

After the production report was generated each subject was asked to complete a tax 
return by typing into the computer the number of letters they wished to report. The 
computer calculated the resulting tax liability, displayed it on the screen, and queried 
the subject as to whether they would like to change the number of reported letters and 
have their tax recalculated. When the subject responded “no,” a tax report was auto- 
matically printed displaying only the final reported letters and the corresponding tax 
liability. 
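The arithmetic behind the production report and the tax return can be sketched as follows. The $.08 piece rate is from the text; the tax rate and penalty rate values used in the demonstration are hypothetical, since the actual levels are not given in this section. The penalty is modeled as a rate applied to the underreported tax, following the description of the instructions; whether the unpaid tax itself was also collected on audit is not stated here, so the sketch returns the penalty alone.

```python
WAGE_PER_LETTER = 0.08  # before-tax piece rate stated in the text, in dollars


def gross_pay(letters_decoded: int) -> float:
    """Income produced: letters correctly decoded times the wage rate."""
    return letters_decoded * WAGE_PER_LETTER


def reported_tax(letters_reported: int, tax_rate: float) -> float:
    """Tax liability computed from the number of letters the subject reports."""
    return letters_reported * WAGE_PER_LETTER * tax_rate


def underreported_letters(letters_decoded: int, letters_reported: int) -> int:
    """Underreporting measured in letters: decoded less reported."""
    return letters_decoded - letters_reported


def audit_penalty(letters_decoded: int, letters_reported: int,
                  tax_rate: float, penalty_rate: float) -> float:
    """Penalty on audit, assumed to be penalty_rate times the underreported tax."""
    shortfall = max(underreported_letters(letters_decoded, letters_reported), 0)
    unpaid_tax = shortfall * WAGE_PER_LETTER * tax_rate
    return penalty_rate * unpaid_tax


if __name__ == "__main__":
    # Hypothetical rates chosen only for illustration.
    tax_rate, penalty_rate = 0.30, 2.0
    decoded, reported = 50, 40
    print(f"gross pay:    ${gross_pay(decoded):.2f}")
    print(f"reported tax: ${reported_tax(reported, tax_rate):.2f}")
    print(f"penalty:      ${audit_penalty(decoded, reported, tax_rate, penalty_rate):.2f}")
```

Keeping the letter counts as the basic unit mirrors the paper's own choice to analyze letters decoded and letters reported, converting to dollars only by multiplying by the wage rate.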

At the end of each work session, the lab monitor collected only the tax reports and 
took them to the “tax collector,” who was positioned in the computer lab where subjects 
knew that she was unable to read computer screens or printed output. Regardless of the 
audit scheme used for each session, two of the ten tax reports were selected for audit 
in each work period. In the random audit environment, a bingo cage was used in which 
there was a total of ten balls, each with a number corresponding to one of the ten sub- 
ject identification numbers. The tax collector selected the tax reports by drawing two 
balls from the bingo cage. The cut-off selection scheme involved comparing the ten re- 
ports filed each work session and selecting for verification the two with the lowest 
reported income. Under the conditional audit decision rule, individuals were parti- 
tioned by the tax collector into two groups of five based on a preliminary signal of their 
ability to perform the task: their performance in the five-minute practice session. The 
conditional audit subjects were aware that this partition was made, but they did not 
know into which group anyone would be placed. The tax collector chose for audit the 
two reports with the lowest reported income from the tax returns of the individuals in 
the high performance group. 
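The three audit selection rules just described can be expressed compactly. The following is a minimal sketch under stated assumptions: the function names, the dictionary inputs keyed by subject number, and the tie-breaking by subject number are all illustrative choices that the paper does not specify.

```python
import random
from typing import Dict, List


def select_random(reported: Dict[int, float], rng: random.Random,
                  n_audits: int = 2) -> List[int]:
    """Random scheme: draw subject numbers without replacement (the bingo cage)."""
    return sorted(rng.sample(sorted(reported), n_audits))


def select_cutoff(reported: Dict[int, float], n_audits: int = 2) -> List[int]:
    """Cut-off scheme: audit the reports with the lowest reported income.

    Ties are broken by subject number, which is an assumption of this sketch.
    """
    ranked = sorted(reported, key=lambda subject: (reported[subject], subject))
    return sorted(ranked[:n_audits])


def select_conditional(reported: Dict[int, float], practice: Dict[int, float],
                       n_audits: int = 2, group_size: int = 5) -> List[int]:
    """Conditional scheme: apply the cut-off rule within the high-performance group.

    The high-performance group is the group_size subjects with the best
    practice-session scores.
    """
    by_practice = sorted(practice, key=lambda subject: (-practice[subject], subject))
    high_group = by_practice[:group_size]
    return select_cutoff({s: reported[s] for s in high_group}, n_audits)


if __name__ == "__main__":
    # Hypothetical reported letters and practice scores for ten subjects.
    reported = {1: 50, 2: 20, 3: 35, 4: 10, 5: 60, 6: 55, 7: 25, 8: 40, 9: 15, 10: 45}
    practice = {1: 60, 2: 30, 3: 55, 4: 20, 5: 70, 6: 65, 7: 35, 8: 50, 9: 25, 10: 45}
    print("random:     ", select_random(reported, random.Random(0)))
    print("cut-off:    ", select_cutoff(reported))
    print("conditional:", select_conditional(reported, practice))
```

Under the conditional rule, a subject outside the high-performance group is never selected regardless of what is reported, which is the feature the later discussion of the untargeted group turns on.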

After the tax collector followed the designated audit selection rule and wrote 
“audit” on two of the reports, the lab monitor returned the tax reports to the subjects. 
This occurred before the beginning of the next work session. As the next work session 
began, the monitor unobtrusively performed the two audits by comparing the actual 
letters shown on the production report to the reported letters on the tax return, and 
noting the amount of any penalty on the return. The tax collector was not aware of the 
results of any audit until the conclusion of the experiment when total taxes and pen- 
alties were collected. This procedure aided in increasing audit confidentiality among 


10 Consistent with the purpose of the experiment as stated to the subjects (elaborated on in fn. 9), taxes,
penalties, and audits were referred to by these terms. Although this empirical question is not yet resolved, a
recent study (Alm et al. 1991) finds that the use or non-use of non-tax or neutral terms has no effect on the
results.
the subjects and in assuring the subjects that the audit decision rule was being followed
independently each work session and that the results of a previous audit had no effect 
on future audit decisions. 

This process was repeated four times for each of the 12 treatments. At the conclu- 
sion of the fourth period, subjects completed a debriefing questionnaire which included 
questions on demographics and clarity of instructions.¹¹ Subjects then took their questionnaire
and four production reports to the “bursar,” where they were paid in cash the
amount of their total earnings. After collecting their earnings, subjects submitted only 
their tax reports to the tax collector, who collected the total reported taxes plus any 
penalties. 


Subjects 


The subjects were 120 volunteers from undergraduate business or economics 
classes. The use of student subjects to perform this generic task was not problematic, 
because it required no particular physical expertise or educational training. More im- 
portantly, the phenomenon being investigated should occur for any type of subject as 
long as significant, real wages are paid for the work performed. 


III. Results 


The subject risk preferences and practice session scores are examined first. These 
scores served as controls to ensure that risk and ability did not differ significantly 
across treatment groups and possibly confound the experimental manipulations. The 
responses to each hypothetical payoff on the risk questionnaire were coded 0 for the 
certainty payoff, 1 for indifference, and 2 for the gamble. Each subject’s nine responses 
were summed to yield a total risk score ranging from zero to 18. The mean total risk 
score for all subjects was 6.9 (std. dev. = 4.04). Since a score of nine would indicate
risk neutrality, the subject sample tended to be risk averse. The results from the practice 
work session, which we consider to be a surrogate for ability to perform the work task, 
were used as a covariate in subsequent analyses. The average number of letters correctly
decoded, the practice score, was 43.55 (std. dev. = 11.16). Comparing across the
factor level groups, risk scores were not significantly different among the twelve treat- 
ment groups (F=0.80, p>F=0.64). Similar results were found for practice scores 
(F=0.90, p>F=0.55). 
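The risk score coding described above can be illustrated with a short sketch. The 0/1/2 coding, the nine questions, and the risk-neutral benchmark of nine are from the text; the response labels, function names, and the $2.00 spread in the gamble example are assumptions made for illustration.

```python
from typing import Sequence

# Coding from the text: certain payoff = 0, indifference = 1, gamble = 2.
RESPONSE_CODES = {"certain": 0, "indifferent": 1, "gamble": 2}
N_QUESTIONS = 9
RISK_NEUTRAL_SCORE = 9  # indifference on all nine questions


def risk_score(responses: Sequence[str]) -> int:
    """Sum the coded responses to the nine payoff pairs (range 0 to 18)."""
    if len(responses) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} responses, got {len(responses)}")
    return sum(RESPONSE_CODES[response] for response in responses)


def gamble_expected_value(certain_payoff: float, spread: float) -> float:
    """Expected value of a 50/50 gamble placed symmetrically around the certain payoff."""
    return 0.5 * (certain_payoff + spread) + 0.5 * (certain_payoff - spread)


if __name__ == "__main__":
    answers = ["certain"] * 5 + ["indifferent"] * 2 + ["gamble"] * 2
    score = risk_score(answers)
    label = "risk averse" if score < RISK_NEUTRAL_SCORE else "not risk averse"
    print(score, label)
    print(gamble_expected_value(7.50, 2.00))
```

Because each gamble is symmetric around the certain payoff, its expected value equals that payoff, so a preference for the certain amount reflects risk attitude rather than a difference in expected return.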

The dependent variables were calculated for each subject for each work session as 
the income actually earned less the income voluntarily reported (UNDER), and the 
income actually earned (EFFORT).¹² A multivariate analysis of covariance (MAN-


11 Mean responses, coded on a scale of one to five, to the instruction clarity questions were as follows:


Written instructions were clear and unambiguous—3.9. 

Computer-provided instructions were clear and unambiguous—4.3 

Training session adequately prepared me for the work sessions—4.4 

Tax returns were selected for audit based on the procedures outlined in the instructions, and there was 
no tampering by those running the study—4.1 


The response scale was labeled strongly disagree (coded 1), disagree (coded 2), neither (coded 3), agree (coded 
4), and strongly agree (coded 5).

12 Income earned and the difference between income earned and income voluntarily reported are measured
in the following analyses in terms of letters decoded and letters decoded less letters reported, respectively.
Multiplying by the piece-rate wage of $.08 converts these measures to dollar
amounts.
Table 2
MANCOVA Results for UNDER and EFFORT¹

                                      Individual Work Session          Average of
Independent Variables    df       1         2         3         4      Work Sessions 2-4

Practice-Covariate        2     54.11°    49.97°    37.42¢    31.39°    53.95°
RATE                      2      0.34      1.20      2.91°     1.62      1.93
PENALTY                   2      0.34      1.97      0.93      1.20      1.49
AUDIT SCHEME              4      3.15°     8.27°     5.81°     8.13°     7.34°
RATE*PENALTY              2      1.85      3.88      4.14°     5.37*     4.88’
RATE*AUDIT                4      0.77      1.79      0.37      0.67      0.48
PENALTY*AUDIT             4      0.51      0.21      0.04      0.98      0.35
RATE*PEN*AUDIT            4      5.62°     5.80°     3.49°     3.31°     4.62°

a Significant at less than 0.10.

b Significant at less than 0.05.

c Significant at less than 0.01.


1 This table reports the exact F-statistics and numerator degrees of freedom based on Wilks’ Lambda that 
result from MANCOVAs using underreporting (UNDER) and the letters produced (EFFORT) as the 
dependent variables. MANCOVAs for each of the four work sessions are shown separately, and an overall 
MANCOVA which used each dependent variable averaged across sessions 2 through 4 as the dependent vari- 
ables is shown in the last column. 


COVA), with Practice included as the covariate, was performed to examine the effect of 
each factor on UNDER and EFFORT. Although Risk and Practice were not significantly 
different across treatment groups, both variables were initially included as covariates 
in the MANCOVA and subsequent tests as an additional control procedure. However, 
only the Practice covariate was significant and retained in the reported analyses. 

The results of the MANCOVA for each work session and for the average of work 
sessions two through four are summarized in table 2. The Practice covariate, the 
AUDIT SCHEME main effect, and the RATE by PENALTY by AUDIT SCHEME inter- 
action were significant (p<0.05) in each individual work session as well as when the 
results from work sessions two through four were averaged. In addition, the RATE by 
PENALTY interaction was significant in sessions two through four, when examined
separately or when averaged. As evidence of the relationship between the UNDER and 
EFFORT measures, the partial correlation coefficient for the two dependent variables 
was 0.25 (p<0.01) for the average results from sessions two through four and reached a 
maximum of 0.33 (p<0.01) in the fourth work session. 

While the results for the second, third, and fourth work sessions were comparable, 
the results for the first work session varied somewhat from the others. The RATE and 
PENALTY main effects and the interaction between these two variables were dimin- 
ished in that session. Although the subjects had performed the task in a training session 
and a practice session prior to the first work session, they had not done so with a tax 
system in place. Thus, the first work session was the first time the subjects had per- 
formed the task after being told their particular levels of RATE, PENALTY, and AUDIT 
SCHEME. It was also the first time they made a reporting decision. Hence, it is possible 
that some subjects were still becoming familiar with the decision implications of 
the experimental factors in the first work session. The subsequent analyses focus on 
Table 3
ANCOVA Results for UNDER and EFFORT¹

Panel A. UNDER:
                                                               Average of
                                   Work Session 1              Work Sessions 2-4
Independent                     Sum of                       Sum of
Variables               df      Squares    F-Statistic       Squares    F-Statistic

Practice-Covariate       1       401.59       2.69            485.97       2.00
RATE                     1        62.40       0.42            937.36       3.86ᵇ
PENALTY                  1        19.55       0.13            536.47       2.21
AUDIT SCHEME             2     1,894.64       6.35ᶜ         5,968.86      12.30ᶜ
RATE*PENALTY             2       339.73       2.28          2,281.77       9.40ᶜ
RATE*AUDIT               2        63.27       0.21             61.47       0.13
PENALTY*AUDIT            2       103.15       0.35            103.13       0.21
RATE*PEN*AUDIT           2     3,086.11      10.35ᶜ         4,205.81       8.67ᶜ
ERROR                  108    15,801.53                    25,724.85

Panel B. EFFORT:
                                                               Average of
                                   Work Session 1              Work Sessions 2-4
Independent                     Sum of                       Sum of
Variables               df      Squares    F-Statistic       Squares    F-Statistic

Practice-Covariate       1     5,294.82     109.20ᶜ         5,193.99     107.57ᶜ
RATE                     1        17.85       0.37              4.76       0.10
PENALTY                  1        22.68       0.47             73.59       1.52
AUDIT SCHEME             1        27.53       0.28            231.51       2.40ᵃ
RATE*PENALTY             2        55.87       1.15              0.05       0.00
RATE*AUDIT               2       134.94       1.39             72.03       0.75
PENALTY*AUDIT            2        63.34       0.65             53.32       0.55
RATE*PEN*AUDIT           2       151.95       1.57            201.48       2.09
ERROR                  106     5,139.87                     6,118.02

a Significant at less than 0.10.

b Significant at less than 0.05.

c Significant at less than 0.01.

1 This table reports the F-statistics that result from ANCOVAs using underreporting (UNDER) in panel A, 
and the letters produced (EFFORT), in panel B, as the dependent variables. The ANCOVA for the first work 
session and an overall ANCOVA which used the dependent variable averaged across sessions 2 through 4 as 
the dependent variable are shown. 


work sessions two through four to examine the experimental manipulations, but the 
results from the first work session are presented separately for comparative purposes. 

As follow-up tests to the MANCOVA, separate analyses of covariance (ANCOVAs) 
were performed on the two dependent variables. The results are presented in panel A 
(Dep. Var. = UNDER) and panel B (Dep. Var. = EFFORT) of table 3 for session one alone 
and sessions two through four averaged.¹³ The RATE by PENALTY by AUDIT


13 There are ten usable observations in each treatment group other than the high RATE, low PENALTY, and
cut-off AUDIT SCHEME condition. Due to missing values, there are only nine usable observations in this cell. 
Since the cell sizes are almost exactly equivalent, the results are extremely robust to any departures from the 

normality or homogeneity of variance assumptions (see Kirk 1982, 74-84). 
Figure 1
AUDIT SCHEME Means for Each RATE/PENALTY Condition

1 UNDER is the average underreporting for the last three work sessions.


SCHEME and RATE by PENALTY significant interactions in the MANCOVA seem to 
be caused by the subjects’ underreporting. Both interactions are statistically significant 
at less than the 0.01 level in panel A and not significant in panel B. 

Figure 1 illustrates the mean UNDER values averaged for sessions two through four 
by AUDIT SCHEME for each RATE and PENALTY combination. This figure shows 
that both the significant two-way and three-way interactions seem to be driven by the 
low RATE, high PENALTY, random AUDIT SCHEME group.¹⁴ Under the other two
AUDIT SCHEMES, given a constant RATE level, a higher PENALTY results in less 
underreporting. Also, for the high RATE, random AUDIT group, a higher PENALTY 
had the expected negative effect on underreporting behavior. However, a higher 
PENALTY had a puzzling, positive effect on underreporting behavior for the low 
RATE, random AUDIT group. Similarly, when PENALTY is low, a higher RATE results 
in greater underreporting for each AUDIT SCHEME. When PENALTY is high, a higher 
RATE results in a greater amount of underreporting for each AUDIT SCHEME, except 
the random one.¹⁵

Investigation of the risk scores revealed that the mean risk score for the low 
RATE, high PENALTY, and random AUDIT treatment group (R=L, P=H, A=R) 
was 9.9, which was the highest of the 12 treatment groups. The other 11 mean risk 


14 Separate ANCOVAs for each AUDIT SCHEME indicate that the RATE by PENALTY interaction is
significant only for the random AUDIT SCHEME (p<0.001).

15 This difference is relatively small in the conditional AUDIT SCHEME.
scores ranged from 5.0 to 7.7. Although the risk scores were not significantly different,
it appears that the relative risk-seeking nature of this treatment group (R=L, P=H, 
A=R) at least partially accounts for the unanticipated behavior of increased under- 
reporting as the penalty level increased and tax rate decreased and for the two-way and 
three-way interactions. 

Although one must be cautious when interpreting main effects in the presence of 
significant higher-order interactions, the results in table 3 for the average of work ses- 
sions two through four indicate that the AUDIT SCHEME main effect was significant 
for the UNDER and EFFORT dependent variables at p<0.01 and p<0.10, respectively. 
In addition, table 4, panel A reveals that the schemes were ordered as expected with the 
most underreporting occurring under the random scheme, followed by the cut-off, and 
then conditional. The UNDER mean in periods two through four for the random 
AUDIT SCHEME was significantly greater (p<0.01) than both the cut-off and conditional
means. However, the cut-off and conditional means were not significantly different.

Although most underreporting occurred in the random AUDIT SCHEME, the 
greatest EFFORT was found in the cut-off AUDIT SCHEME (see table 4, panel B). In the 
final work session, the average EFFORT for the cut-off scheme was significantly greater 
than those of both the random and conditional schemes (p<0.05). Thus, as expected, 
AUDIT SCHEME seemed to have a primary and secondary effect on EFFORT. The pri- 
mary effect was particularly strong in the cut-off scheme. Here all subjects might 
believe that they could decrease their individual chance of audit for a given level of 
underreported income by producing more. 

The RATE main effect was significant at the 0.05 level in the ANCOVA for the de- 
pendent variable UNDER (see table 3, panel A). In addition, panel A of table 4 reveals 
that UNDER was significantly greater for the high RATE group in the third and fourth 
work sessions (and the average of the last three work sessions). Due to increased under- 
reporting in the high RATE group, the financial effect may have mitigated what might 
have otherwise been a dominant substitution effect of RATE on EFFORT, which may 
account for the insignificant effect of RATE on EFFORT. 

The effect of PENALTY on UNDER and EFFORT was not significant (see tables 3 
and 4). More underreporting and effort occurred in the low PENALTY condition for
each work session except the first. However, the differences were not significant in any 
of the work sessions. 

Additionally, those who elected to underreport income from one or more of the 
total letters produced in any session produced an average of 56 letters per session. In 
contrast, those who did not underreport income had an average production of 51 letters 
per session. The difference between the means was significant at p<0.05, which pro- 
vides further evidence that capturing the taxpayer’s response to the tax system as a 
joint decision of labor supply and voluntary reporting was appropriate. 


Discussion 


Three results merit further discussion. First, subjects in our experiment behaved as 
if their work effort and reporting decision were made jointly and hence moved to- 
gether. This finding has important implications for the expanding problem of the 
underground economy (the value of all economic activity that is unrecorded in national 
income accounts and is presumably untaxed). Although previous estimates of the size 
Table 4
UNDER and EFFORT Mean Comparisons by Factor Levels¹

Panel A. UNDER:
                                   Work Sessions        Average of
Factor          Level              2         3          Work Sessions 2-4

RATE            Low              11.29     10.68°         11.72*
                High             15.71     17.80          17.39

PENALTY         Low              15.75     16.37          16.70
                High             11.24     12.21          12.41

AUDIT SCHEME    Random           23.36°    23.84"         24.51°
                Cut-off           8.82     10.15          10.33

                Random           23.36°    23.84°         24.51°
                Conditional       8.31      8.88           8.82

                Cut-off           8.82     10.15          10.33
                Conditional       8.31      8.88           8.82

Panel B. EFFORT:
                                   Average of
Factor          Level              Work Sessions 2-4

RATE            Low              54.59
                High             54.89

PENALTY         Low              55.58
                High             53.99

AUDIT SCHEME    Random           54.84
                Cut-off          56.49

                Random           54.84
                Conditional      53.03

                Cut-off          58.49*
                Conditional      53.03

a Significant at less than 0.10.
b Significant at less than 0.05.
c Significant at less than 0.01.
1 This table shows the adjusted means of the three experimental factors for each of the four work sessions
and averaged across the last three work sessions (2 through 4) adjusted for the covariate, Practice, and the
significance levels for pair-wise comparisons using planned T-tests.


of the underground economy are quite variable due to differences in the details and 
assumptions of the research, all are quite large. For example, a sample of original esti- 
mates by four researchers indicates a range for the 1976 underground economy from $40 
billion to $389 billion (Feige 1979; Gutmann 1977; Henry 1976, 1983; Tanzi 1982). Our 
findings suggest the possibility that the growth in the underground economy may be stim- 
ulated not only from the self-selection of those seeking noncompliance opportunities into 
these economic activities, but also from an additional labor supply response. Those 
working in the underground economy may take further advantage of their noncompliance
opportunities by supplying more labor and thus exacerbating the problem.

Second, the increased underreporting discovered in the high RATE treatment 
groups is consistent with Clotfelter’s (1983) positive estimate of the elasticity of unre- 
ported income with respect to marginal tax rates. He used disaggregated data from the 
Internal Revenue Service’s Taxpayer Compliance Measurement Program (TCMP) sur- 
vey for 1969 to examine the relationship between tax evasion and tax rates. Because of 
data limitations, Clotfelter’s conclusions need more confirmation. In particular, it is 
difficult to disentangle the tax rate effect from the income effects in the 1969 TCMP 
sample, since the tax structure by definition links tax rates to income levels. Thus, the 
marginal tax rates observed in this type of data are not truly exogenous, as is the case in 
our experimental setting.¹⁶

It would be useful to extend this research to determine whether there is a 50 percent 
threshold effect. Do higher tax rates at all levels lead to more underreporting in an 
experimental environment characterized by endogenous income, or is the effect only 
found when comparing a tax rate below 50 percent to a rate greater than 50 percent? 
Relevant considerations include relaxing the budget constraint imposed in our experi- 
mental environment, since a higher tax rate should increase both the incentive to 
underreport and the incentive to audit. 

Finally, conditional audits were not significantly more effective than cut-off audits 
in preventing underreporting. Subjects in the conditional audit treatment groups 
always were aware that a targeted audit scheme was being followed, and experience 
may have allowed them to infer whether they were in the “untargeted” audit group.
Thus, while the conditional scheme may be more effective in curbing underreporting 
in the high skill audit group, underreporting may have been stimulated in the “untargeted”
group.¹⁷

Evidence of this behavior is found when the conditional audit scheme subjects are 
categorized by the audit group in which they were placed after the practice work session 
(SKILL). Four separate analyses of variance with UNDER for each work session as the 
dependent variable and SKILL, RATE, and PENALTY as the independent variables 
found SKILL was only significant in work sessions three and four (p<0.01). 

In actuality, the IRS follows strategic auditing. However, due to the much larger 
sizes of the targeted and untargeted groups in the actual tax environment, the informa- 
tion update is not as complete as was possible in our experiment. Nevertheless, our 
results could partially explain why the IRS publicizes that its strategic auditing procedures
are supplemented by random audit selections. This combination prevents
those taxpayers who, as a result of no or low prior audit experience, begin to perceive
that they are not “targeted” taxpayers from assuming that they are not subject to
any type of enforcement procedure.


16 In addition, the effect of marginal tax rates is not confounded with complexity in our experimental setting.

17 Unaudited subjects were not able to rule themselves out of the targeted group. Only in the low RATE,
high PENALTY group did all five of the targeted subjects get audited. However, this was not accomplished 
until work session four, so that subjects were unable to use this information. 
IV. Conclusion


This is the first study that experimentally investigates the effect of audit schemes 
that incorporate taxpayers’ self-reported income as well as tax rates and penalty levels 
when allowing true income to vary endogenously. The results indicate that “reported” 
and “actual” income do vary concurrently: those electing to underreport also earn 
more (actual) income. In addition, audit schemes that incorporate some preliminary information
signal sent by the taxpayer may be more successful in curbing underreporting
than purely random audit models. Across tax rate and penalty levels, these nonrandom
schemes are most effective when tax rates
are low and penalty levels are rather high.

Due to the experimental aspects of this research, specific policy recommendations 
are not warranted at this time. Potential generalizability depends on future work that 
examines the sensitivity of these results to other tax rates, penalty levels, and audit budget
resources. The results indicate that further research allowing actual income earned
to be determined jointly with the reporting decision may be warranted. Also, future 
studies that investigate other signal-dependent audit schemes and perhaps combine a
signal-dependent scheme with a random scheme are necessary.
The income tax has made more liars out of the American people than golf has.
Will Rogers 


Tax evasion appears to be a large and growing problem in the United States. Despite
obvious difficulties in measurement, the Internal Revenue Service (1990)
estimates that the tax gap, or the amount of underreported federal income taxes,
was $83-94 billion in 1987, and had grown at an annual rate of over ten percent since
1973. Such underreporting reduces the tax revenues of the federal government, affects 
public provision of goods and services, creates misallocations in resource use, alters 
the distribution of income in unpredictable ways, and increases feelings of unfair treat- 
ment and disrespect for the law. 

The analysis of the individual reporting decision has taken a variety of approaches. 
The underlying premise of nearly all of these approaches, at least those in economics, 
has been the same: individuals pay taxes because they fear detection and punishment. 
This economics-of-crime approach is based on traditional expected utility theory, 
which views a rational individual as weighing the expected utility of benefits from suc- 
cessful underreporting against the uncertain prospect of detection and punishment. 

This approach has generated numerous insights.¹ There are, however, several
fundamental problems with the existing applications of expected utility theory to tax- 
payer reporting. First, although it is clear that enforcement actions affect compliance to 
a degree, it is equally clear that detection and punishment cannot explain all compliance
behavior.² The frequency of audit in the United States has fallen to less than one
percent, and the additional penalties constitute only a fraction of the unpaid tax liabil- 
ity. According to expected utility theory, most individuals should choose to under- 
report all of their taxes, at least on income not subject to third-party reporting. How- 
ever, the extent of compliance with the income tax remains relatively high, which 


1 See Cowell (1990) for a comprehensive review of much of this literature. 
2 “Compliance” is defined here as reporting all income and paying all taxes in accordance with the appli- 
cable laws, regulations, and court decisions. 
cannot be due to enforcement activities alone. Other factors must play a dominant role 
in the reporting decision. Indeed, the puzzle of tax compliance is why people pay taxes, 
not why they evade them. The challenge facing those who analyze the reporting deci- 
sion is to explain these levels of compliance, as well as the changes in compliance that 
result from policy changes. 

Second, the failure of expected utility in the analysis of tax compliance reflects a 
growing dissatisfaction with this approach in the analysis of individual choice under 
uncertainty. Numerous recent surveys (Fishburn 1988; Machina 1987, 1989; Sugden 
1988) document evidence from a variety of areas showing that individuals in many cir- 
cumstances do not in fact maximize expected utility. In response, alternative non- 
expected utility models of behavior have been advanced. However, these theories have 
not yet been fully analyzed, and none have been applied to tax compliance. 

In this perspective, I discuss and assess what we have learned from research on the 
individual taxpayer reporting decision, with attention to the specific evidence of Beck 
et al. (1991b) and of Collins and Plumlee (1991) in this Forum. I draw two general con- 
clusions from this literature, which parallel the above problems with the economic 
analysis of tax compliance. First, compliance is a complicated decision, one that clearly 
depends in part upon the financial incentives facing the taxpayer but one that also de- 
pends upon other factors that have not been incorporated into the expected utility 
theory application. This deficiency renders all existing applications of expected utility 
theories in tax compliance (including my own work) incapable of explaining much of 
the actual tax reporting decisions of individuals. Second, and as a consequence, 
broader approaches to the reporting decision that encompass non-expected utility 
theories of behavior are needed.³ 

Still, the existing experimental analyses of taxpayer reporting decisions make a 
contribution to this area. For example, Beck et al. (1991b) allow taxable income to be 
uncertain, and find that such uncertainty has significant and complex effects on 
reported income.⁴ Collins and Plumlee (1991) recognize that the probability of audit 
depends in part on the choices of the taxpayer, and find that such audit endogeneity 
affects the joint reporting and labor supply decision. These are important and new 
results. Their work also shows that comparative statics can be usefully employed to ex- 
plain the changes in the reporting decisions of individuals in response to changes in 
various policy parameters or institutions. However, if the goal instead is to explain the 
level of compliance, then a new approach is needed that incorporates the major ele- 
ments of the reporting decision in a theory that may well depart from expected utility 
theory; that is, existing theories of tax compliance must be modified to consider both 
additional elements of the tax reporting decision and alternative approaches to behavior
under uncertainty. This need can be met by various methods, including those 
of experimental economics. 

The remainder of this article is structured as follows. Section I briefly reviews 
much of the non-experimental literature on tax compliance. Section II discusses the 


3 Note that the focus is on individual tax compliance. Of perhaps equal importance is corporate compliance. 
However, with the exception of Rice (1990), corporate compliance has received little attention. 

4 It should be noted that Beck et al. (1991b) analyze taxpayer aggressiveness, defined as a willingness to
report a low income when it is uncertain whether a high or a low income is the correct, post-audit tax base and 
when either report has legal justification. Although there are differences between “aggressiveness,” “reporting,”
and “compliance,” these terms will be used interchangeably in this perspective. 
application of experimental methods to compliance research, including the appropriate 
procedures that should be followed in experimental analysis, the results that have been 
derived from previous studies, and the limitations of experimental methods. Section III 
reviews the work of Beck et al. (1991b) and of Collins and Plumlee (1991). Section IV 
presents some conclusions and observations. 


I. Non-Experimental Research on Tax Compliance 
Theoretical Literature 


The dominant approach to the analysis of taxpayer reporting follows the eco- 
nomics-of-crime methodology first applied to tax compliance by Allingham and 
Sandmo (1972).⁵ In this model, an individual is assumed to receive a fixed amount of 
income I, and must choose how much of this income to declare to the tax authorities 
and how much to underreport. The individual pays taxes at rate t on every dollar D of 
income that is declared, but pays no taxes on underreported income. However, the in- 
dividual may be audited with some fixed probability p; if audited, then all underre- 
ported income is discovered, and the individual must pay a penalty at rate f on each 
dollar of deficient taxes, where f includes the unpaid taxes. If underreporting is de- 
tected, the individual’s disposable income equals: 


$I_C = I - tD - ft(I - D)$, (1) 
while if underreporting is not detected, income is: 
$I_N = I - tD$. (2) 


Expected utility theory then suggests that the individual will choose D to maximize the 
expected utility EU (I) of the evasion gamble: 


$EU(I) = pU(I_C) + (1 - p)U(I_N)$, (3) 


where utility U(I) is assumed to be a function of income. In this model, an increase in 
either the probability of detection p or the penalty rate f unambiguously increases declared 
income D. An increase in the tax rate t also increases reported income if the 
individual’s preferences exhibit decreasing absolute risk aversion.⁶ In this case, an increase
in the tax rate increases the benefits of underreporting income but also increases 
in equal proportion the cost of underreporting, such that the relative price of under- 
reporting is unchanged. However, the higher tax rate also lowers income and increases 
the individual’s risk aversion, which leads to more reported income. 
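
These comparative statics follow from the first-order condition for declared income. As a brief sketch, write $I_C$ and $I_N$ for disposable income when underreporting is and is not detected, and assume an interior solution with $U' > 0$ and $U'' < 0$ (assumptions made here for illustration, not stated explicitly in the text):

```latex
\begin{align*}
\frac{\partial EU}{\partial D}
  &= p\,U'(I_C)\,\frac{\partial I_C}{\partial D}
   + (1-p)\,U'(I_N)\,\frac{\partial I_N}{\partial D} \\
  &= t\left[\,p\,(f-1)\,U'(I_C) - (1-p)\,U'(I_N)\,\right] = 0 .
\end{align*}
```

At full reporting (D = I) the two incomes coincide, so the bracketed term has the sign of pf - 1. Under these assumptions the individual underreports at least some income whenever pf < 1, that is, whenever the expected penalty on a dollar of evaded tax is less than the dollar itself.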

This basic model has been extended to allow for different settings. For example, the 
individual may choose declared income jointly with additional variables, such as labor 
supply or income allocated to tax avoidance. Alternative tax and penalty functions can 
be introduced. The impact of complexity and uncertainty about the relevant fiscal 
parameters can be analyzed, as can the effect of tax preparers in resolving this uncer- 
tainty. Government can be allowed to spend the taxes that are collected. The assump- 
tion that the probability of detection is fixed for an individual (a random audit strategy) 


5 There are, of course, other approaches that do not invoke the expected utility hypothesis. See Roth et al. 
(1989) for a detailed discussion. 

6 Absolute risk aversion A(I) is defined as $A(I) = -U''(I)/U'(I)$, where a prime (') denotes a partial derivative.
It is commonly assumed that A(I) decreases with income. 
can be relaxed by allowing the audit agency to use information from the taxpayers’ returns
in determining whom to select for audit and by examining the interaction of the 
taxpayers and the government collection agency in a game theory setting.⁷ 

Virtually all theoretical work has continued to use the expected utility model, and it 
has generated many insights, especially regarding how an individual responds to 
greater enforcement activities and how government can optimally choose its enforce- 
ment strategy. However, this literature is in a sense too complex. It is only in the sim- 
pler models that clear-cut analytical results can be generated on the compliance impact 
of basic policy parameters. When more complex dimensions of individual behavior are 
introduced, the theoretical results generally become ambiguous. It is doubtful that the 
theoretical analysis will yield more meaningful results in the future.⁸ 

Paradoxically, the theoretical models of individual choice are also too simple. The 
Internal Revenue Service (1978) has listed 64 factors that may affect the reporting deci- 
sions of taxpayers, but theoretical models are capable of including only a few. The 
limited ability to incorporate many relevant factors or to incorporate them in a mean- 
ingful way has meant that these theories are often unable to explain the level of tax 
reporting, even when they are capable of explaining changes in reporting in response to 
policy innovations. In particular, as emphasized by Graetz and Wilde (1985) and others, 
these models generally imply that rational individuals should pay far less in taxes than 
they actually do.⁹ 

In short, expected utility theory implies that individuals should pay less in taxes 
than they in fact do. This result suggests that the compliance decision must be affected 
by other factors not mentioned by expected utility theory or must be affected in ways 
not captured by the theory. 


Empirical Literature 


The obvious difficulty in applied work is the absence of reliable information on in- 
dividual reporting behavior. To facilitate empirical research, the IRS has begun to make 
data available to researchers through its Taxpayer Compliance Measurement Program 
(TCMP), and virtually all empirical models of compliance in the United States are based 
on these data.¹⁰ For example, Clotfelter (1983), Witte and Woodbury (1985), and Dubin 
and Wilde (1988) have used different forms of these data to estimate the impact of 
policy parameters (e.g., marginal tax rates, audit rates, and penalty rates) upon various 
direct and indirect measures of tax evasion. These studies have found the unsurprising 


7 See Roth et al. (1989) and Cowell (1990) for comprehensive surveys of this theoretical literature. 

8 For example, Pencavel (1979, 123-24) demonstrates that when labor supply is endogenous the impact on 
reported income of changes in the fine rate and the probability of detection becomes ambiguous. He concludes 
that the results of the existing literature are “fragile” because they are “generated from a set of rather narrow 
postulates.” 

9 To illustrate, consider again the standard expected utility model of Allingham and Sandmo (1972). 
Suppose that the general utility function U(I) is replaced with the specific function $I_i^{1-e}/(1-e)$, where the
subscript i refers to the state of the world (i = C, N) and e is a measure of the individual’s constant relative aversion
to risk. Using the definitions of income in equations (1) and (2), the maximization in equation (3) can be solved 
for the optimum amount of reported income D*. Now suppose that D* is calculated for specific parameter 
values. For example, if t = 0.30, f = 2.0, p = 0.02, and e = 1, then an individual will optimally report zero income. 
Even when the probability of detection equals 20 percent, the optimal D* is zero. Very large (and unrealistic)
values for relative risk aversion are required to generate compliance levels broadly consistent with actual 
United States experience. Simple modifications to the basic model do not significantly change the results. 
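
The calculation described in this footnote can be reproduced numerically. The following is a minimal sketch, not the author's own computation: it normalizes income to I = 1, treats the e = 1 case as the logarithmic limit of the constant relative risk aversion function, and finds D* by a simple grid search. The function names and the grid resolution are illustrative assumptions.

```python
import math


def crra_utility(income: float, e: float) -> float:
    """CRRA utility I^(1-e)/(1-e); e = 1 is taken as its log limit (assumption)."""
    if income <= 0:
        return float("-inf")
    if abs(e - 1.0) < 1e-12:
        return math.log(income)
    return income ** (1.0 - e) / (1.0 - e)


def expected_utility(D, I, t, f, p, e):
    """Expected utility of declaring D out of true income I."""
    income_caught = I - t * D - f * t * (I - D)   # equation (1)
    income_not_caught = I - t * D                 # equation (2)
    return (p * crra_utility(income_caught, e)
            + (1.0 - p) * crra_utility(income_not_caught, e))  # equation (3)


def optimal_report(I=1.0, t=0.30, f=2.0, p=0.02, e=1.0, steps=1000):
    """Grid search over D in [0, I] for the report that maximizes EU."""
    grid = [I * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda D: expected_utility(D, I, t, f, p, e))


if __name__ == "__main__":
    for p in (0.02, 0.20):
        print("p =", p, "optimal D* =", optimal_report(p=p))
    # A much larger risk-aversion value is needed before any income is reported.
    print("e = 10, optimal D* =", optimal_report(p=0.02, e=10.0))
```

With the footnote's parameter values the search returns the corner solution D* = 0 at both audit probabilities; only a far higher value of e (here 10, chosen arbitrarily) produces a positive report.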

10 However, see Alm, Bahl, and Murray (1990) for empirical work that uses Jamaican data. 
results that an increase in the marginal tax rate decreases reported income and that an 
increase in audit rates increases reported income. 

The empirical work has, however, several major weaknesses. The most trouble- 
some is the quality of the data. Most data are from the TCMP, which contains a detailed 
line-by-line audit of a stratified random sample of roughly 50,000 individual tax returns 
conducted on a three-year cycle. These audits yield an IRS estimate of the taxpayer’s 
“true” income so that a measure of individual tax evasion can be calculated. However, 
only Clotfelter (1983) has had access to the microlevel estimates of evasion.¹¹ Most 
researchers use TCMP data aggregated to the three-digit zip code level, an aggregate 
measure likely to comprise disparate elements of underreporting that reflect very dif- 
ferent motivational factors. TCMP data also have some well-recognized deficiencies: 
the audits do not detect all underreported income, nonfilers are not captured, honest 
errors are not identified, final audit adjustments are not included, and there are few 
noneconomic factors to which the data can be linked. The use of TCMP data for 
empirical estimation of the determinants of compliance behavior is therefore problem- 
atic. 

Most empirical work is also plagued by problems of endogeneity, particularly in 
measures of the audit rate. As correctly argued by Dubin and Wilde (1988), if reported 
income depends upon the probability of detection and the probability of detection also 
depends upon reported income—that is, if IRS behavior is endogenous, so that the 
assumption of a random audit strategy is inappropriate—then using the audit rate as an 
explanatory factor leads to simultaneity bias. Only Dubin and Wilde (1988) and Dubin 
et al. (1990) consider the possible endogeneity of the audit rate in their econometric esti- 
mation; however, they use aggregate, not individual, measures of reported income. 

To avoid the problems with the TCMP data, Crane and Nourzad (1986) examine the 
impact of inflation on aggregate evasion, defined as the gap between income reported 
on tax returns and income in the national income accounts. Dubin et al. (1990) estimate 
the impact of audit rates on reported income of the states, with data on reported income 
by state collected from annual reports of the IRS. By necessity, these studies focus on 
the aggregate, not the individual, response. 

Surveys of taxpayers have also been used to assess factors such as perceptions of 
the probability of detection, the fairness of taxation, and the responsiveness of government
in the respondent’s reporting decision.¹² Unfortunately, these surveys are also 
subject to a number of methodological problems. As emphasized by Klepper and Nagin 
(1989), individuals may not remember their reporting decisions, they may not respond 
at all, or they may not respond truthfully. Surveys are also unable to control for many 
relevant determinants of compliance. Finally, they cannot determine the direction of 
causality between compliance and its determinants; that is, statements regarding the 
unfairness of a tax may result from a rationalization of noncompliance rather than be 
the cause of noncompliance. 


11 The IRS has recently begun to allow researchers access to the micro level data, on a controlled basis. 
12 This literature is quite large. See, e.g., Westat, Inc. (1980), Mason and Calvin (1984), Thurman et al. 
(1984), and Yankelovich, Skelly, and White, Inc. (1984). Also, see the discussion in Roth et al. (1989). 
II. Experimental Analysis of Taxpayer Reporting 


Difficulties with the existing theoretical and empirical literatures have led to the 
use of experimental economics. The use of laboratory experiments in economics began 
in the early 1960s with the work of Siegel and Fouraker (1960) and Smith (1962, 1964) on 
resource allocation under alternative forms of market organization. Growth in its appli- 
cations came with the establishment of a well-defined framework for experimental 
work by Smith (1976, 1982) and Wilde (1980). 

Laboratory experiments seem particularly well-suited for the study of some aspects 
of the taxpayer reporting decision. Experiments are not constrained by the same 
degree of simplification present in analytical studies of reporting, which allows the im- 
pact of numerous factors not amenable to theoretical work to be examined. Unlike em- 
pirical work, experiments generate data under different settings in which there is con- 
trol over extraneous influences. As discussed below, there are some obvious limitations 
of experimental methods. However, given the weaknesses of other methodologies, 
there are compelling reasons for the use of experiments. 


Creating a Microeconomic System: Induced Value Theory 


Experimental economics involves the creation of a real microeconomic system in 
the laboratory. The essence of such a system is control over the environment, the insti- 
tutions, and the preferences that subjects face. Of these, control over preferences is par- 
ticularly crucial. As stated by Smith (1976, 275), “[s]uch control can be achieved by us- 
ing a reward structure to induce prescribed monetary value on actions.” 

Smith (1982) identifies several (sufficient) conditions that must be satisfied for con- 
trol over preferences to be established: (1) nonsatiation—subjects must prefer more to 
less; (2) saliency—the rewards received by subjects must be related to their decisions, so 
that subjects recognize that their actions affect their outcomes; (3) reward dominance— 
rewards must be large enough to offset any subjective costs or benefits that subjects 
place on participation in the experiment, which requires the payment to subjects of an 
amount comparable to what they could earn outside the laboratory; and (4) privacy— 
each subject must know only his or her own payoffs so that they do not receive any sub- 
jective value from the payoffs of other subjects. 

Davis and Swenson (1988) identify several procedures that must also be followed in 
experimental economics. For example, the experiment should be administered in a uni- 
form and consistent manner to allow replicability. The experiment should not be ex- 
cessively long or complicated, since subjects may become bored or confused. Subjects 
must believe that the procedures described to them are the procedures actually fol- 
lowed. 

The instructions provided to subjects are of particular importance. The instructions 
should be understandable, should avoid the use of examples that lead subjects to anchor 
on certain choices that are the subject of the experiment, and should be phrased in neu- 
tral rather than loaded terms, to mask the context of the experiment and avoid direct 
reference to the real-world phenomena under investigation. Neutrality increases the ex- 
perimenter’s control over subject preferences and avoids leading subjects to invoke dif- 
ferent “mental scripts,” which may enable them to fill in (potentially) missing informa- 
tion in the instructions but which also may unpredictably influence their choices. It is 
sometimes claimed that the use of neutral instructions limits the ability to generalize 
from the experimental to the naturally occurring setting. In fact, however, it is not 


Alm—Experimental Analysis of Taxpayer Reporting 583 


possible to generalize beyond the laboratory unless one uses neutral instructions, since 
the experimenter cannot control (or induce) the values that subjects associate with 
loaded terms. 


Previous Experimental Work on Taxpayer Compliance 


The basic design of most compliance experiments is similar. Human subjects in a 
controlled laboratory are told that they should feel free to make as much income as pos- 
sible. At the beginning of each round of the experiment, each subject is given income 
and must decide how much income to report. Taxes are paid at some rate on all re- 
ported, but not on underreported, income. However, underreporting is discovered with 
some probability, and the subject must pay a fine on unpaid taxes. This process is 
repeated for a given number of rounds. At the completion of the experiment, each 
subject is paid an amount (say, the accumulated earnings) that depends on his or her 
performance during the experiment. Into this microeconomic system, various policy 
changes can be introduced. 
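
The round structure just described can be sketched as a short simulation. This is an illustration only: the parameter values, the fixed reporting rule, and the function name are hypothetical, and the fine is assumed to be a multiple of unpaid taxes that includes repayment of those taxes.

```python
import random


def run_session(report_fraction, rounds=20, income=100.0, tax_rate=0.30,
                audit_prob=0.05, fine_rate=2.0, seed=0):
    """Simulate one subject who reports a fixed fraction of income each round.

    All parameter values are hypothetical. Returns accumulated earnings,
    which is the amount the subject would be paid at the end.
    """
    rng = random.Random(seed)
    earnings = 0.0
    for _ in range(rounds):
        reported = report_fraction * income
        unpaid_tax = tax_rate * (income - reported)
        payoff = income - tax_rate * reported      # taxes on reported income only
        if rng.random() < audit_prob:              # underreporting is discovered
            payoff -= fine_rate * unpaid_tax       # fine on unpaid taxes
        earnings += payoff
    return earnings


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        print(fraction, run_session(fraction))
```

Policy changes of the kind studied in the literature correspond to varying `tax_rate`, `audit_prob`, or `fine_rate` across sessions; actual experiments let subjects choose their report each round instead of fixing it in advance.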

In the first experimental study of tax compliance, Friedland et al. (1978) examine 
subject responses to changes in tax rates, penalties, and frequencies of audits. They 
find that an increase in the tax rate increases both the probability of underreporting and 
the level of underreporting, and that large fines are more effective deterrents than fre- 
quent audits, although increases in either fines or audit rates increase reported income. 

A number of experimental studies have followed. Spicer and Becker (1980) find that 
compliance is lower among subjects who are told that their tax rate is higher than that 
of others, and that compliance is higher among those who are told that their tax rate is 
lower than that of others. They conclude that perceptions of “fiscal inequity” affect compliance.
In a similar vein, Becker et al. (1987) introduce a public sector transfer scheme, 
and find that individual compliance declines if the subject believes that he or she re- 
ceives less than others. Spicer and Thomas (1982) find that the compliance decision is 
affected by the presence of uncertain audit probabilities, although the effect is some- 
what complicated. Spicer and Hero (1985) and Webley (1987) analyze the impact of 
audits on post-audit compliance, and find that subjects who have been audited report 
more income. Baldry (1986) finds that some individuals do not cheat because of moral 
reasons. Collins et al. (1990) allow a joint choice of work effort and reporting, and find 
that subjects who can underreport tend to work harder than subjects who have no 
opportunity for noncompliance. 

In a series of recent papers, my co-authors and I have extended the analysis of tax- 
payer compliance in several directions. Alm, McKee, and Beck (1990) find that the in- 
troduction of a tax amnesty in which subjects can pay previously unpaid taxes without 
penalty lowers post-amnesty compliance, although a well-designed amnesty is able to 
reverse this decline. Alm, McClelland, and Schulze (forthcoming) find that many sub- 
jects overweight the low probability of audit, which leads them to pay more in taxes 
than suggested by expected utility theory. They also find that the introduction of a 
public good financed by the taxes increases compliance. Alm et al. (1991a) find that the 
impact of taxpayer uncertainty about the tax rate, the fine rate, and the audit rate 
depends critically on the presence or absence of a public good. When subjects receive 
something for their tax payments, uncertainty always lowers compliance; when there is 
no public expenditure, uncertainty always has the opposite effect. Alm et al. (1991b) 
report higher compliance when individuals vote on the use of their tax payments than 
when the identical result is imposed on them; further, compliance is higher when the 
vote is decisive rather than close. Alm, Jackson, and McKee (forthcoming) show that 
positive rewards to compliant taxpayers have a significant, positive effect on tax com- 
pliance. 

In total, these experimental studies suggest the following conclusions: 


1. Individuals report less income as the tax rate rises. 

2. An increase in the penalty rate increases compliance. 

3. More frequent audits encourage greater compliance. 

4. Many individuals overweight the probability of audit, behaving as if the probability
is higher than it actually is. 

5. Compliance is reduced when individuals feel that they are treated unfairly rela- 
tive to others, regardless of the source of the inequity (from the tax or the ex- 
penditure side). 

6. The compliance decision is made jointly with the labor supply decision. 

7. Individuals pay more in taxes when they receive something for their tax pay- 
ments. 

8. Compliance can be encouraged by rewards as well as penalties. 

9. The institutions that determine the use of tax payments affect compliance. Indi- 
viduals pay more in taxes when faced with a public program that they select 
and that they know enjoys widespread support. 

10. The introduction of uncertain tax policies increases compliance in the absence 
of government expenditures and decreases compliance in their presence. 

11. A tax amnesty lowers post-amnesty compliance if the amnesty is not well- 
designed. 

12. Some individuals pay taxes because they believe that cheating is wrong. 


Some of these results (e.g., the impact of fines and audit rates) are similar to those found 
in the theoretical and empirical literatures, but others are not (e.g., the effect of tax 
rates), and most cannot be derived from the theoretical and empirical analysis of tax- 
payer reporting. As discussed later, some of these conclusions are altered by the experi- 
ments of Beck et al. (1991b) and of Collins and Plumlee (1991) included in this Forum. 


Limitations of Experimental Economics 


There are sound reasons for caution in interpreting and generalizing these experi- 
mental results. Some early experiments did not follow some now widely accepted pro- 
cedures of the experimental paradigm, such as the use of repeated experiments and 
neutral instructions. Much work also lacks realism because values of the various policy 
parameters do not approximate real-world values. 

Although more recent experimental research has generally addressed these prob- 
lems, some concerns remain, some of which are more real than others. A common criti- 
cism of experimental economics is that the student subjects typically used may not be 
representative of taxpayers. However, there is now much evidence that the experi- 
mental responses of students are no different than the responses of other subject pools 
(Plott 1987). 

Of more legitimate concern, the results may well be sensitive to the specific experi- 
mental design, so that replication is crucial. It is also possible that subjects modify their 
behavior simply because they know that they are participating in an experiment. Most 
importantly, there is a certain artificiality in a laboratory setting. A decision to report 
$2 in an experiment is clearly different from a decision to report actual income on an 
annual tax return, even if the laboratory incentives are salient. In particular, the labora- 
tory setting cannot capture a catastrophic loss such as jail, and it cannot capture the 
social stigma that some surveys suggest is an important factor in taxpayer reporting. 
There is also some evidence from Webley and Halstead (1986) that subjects may behave 
differently in the laboratory than in the naturally occurring world. 

In short, one must use the results from laboratory experiments with some care. 
However, such use depends largely upon the purpose of the experiment. According to 
Roth (1987), experiments can be classified into three broad categories that depend upon 
the dialogue in which they are meant to participate: “Speaking to Theorists” includes 
those experiments designed to test well-articulated theories; “Searching for Facts” in- 
volves experiments that examine the effects of variables about which existing theory 
has little to say; and “Whispering in the Ears of Princes” identifies those experiments 
motivated by specific policy issues. To date, most experiments on taxpayer reporting 
fall into the first two categories. It is likely to be some time before a serious dialogue 
with the princes of the IRS is established. 


III. Experimental Work in this Forum 


The experiments of Beck, Davis, and Jung (BDJ) and of Collins and Plumlee (CP) in 
this Forum examine different aspects of the taxpayer reporting decision. BDJ focus on 
taxpayer uncertainty about the true level of taxable income, while CP introduce non- 
random audit selection schemes. Each study finds that reporting is affected by the spe- 
cific factor introduced. These results are new and important, and represent some of the 
first attempts to systematically examine these issues. 


The Beck, Davis, and Jung Study 


Taxpayer uncertainty arises because of imprecision and complexity in the tax code, 
lack of uniform training and abilities among government auditors, taxpayer ignorance 
of penalties or the factors that cause an audit, frequent changes in the tax code, or 
merely the ongoing potential for changes in the code. A growing theoretical literature 
(Beck and Jung 1989; Scotchmer and Slemrod 1989) suggests that an individual who 
maximizes expected utility may respond to greater uncertainty by increasing reported 
income. There is also some anecdotal evidence that the IRS has deliberately maintained 
some ambiguity in its policies in order to generate such reactions. There is, however, 
little evidence to support this prediction, especially when there is uncertainty about the 
level of taxable income.¹³ 

BDJ provide this evidence. They examine the effects of taxable income uncertainty 
by assuming that a subject does not know with certainty his or her taxable income. 
Rather, in the event of an audit, a subject’s taxable income is determined by a drawing 
from a probability distribution of taxable incomes in which possible values of taxable 
income range uniformly from a low to a high value. 

The theoretical model of BDJ is rich and allows them to derive a number of specific 
hypotheses. Some of these predictions are fairly standard. For example, an individual is 
always predicted to increase reported income when either the probability of detection 


13 Note, however, that Spicer and Thomas (1982) examine the impact of uncertainty about the probability of 
detection, and Alm et al. (1991a) analyze the effects of uncertainty about the tax, audit, and fine rates. 
or the fine rate increases, and a risk-averse individual is predicted to increase reported 
income when the tax rate increases. However, other hypotheses are more complicated. 
In particular, an increase in taxable income uncertainty (measured by the range of tax- 
able income in the uniform distribution) may either increase or decrease reported 
income, depending upon the degree of risk aversion and the values of the audit and fine 
rates. In general, greater uncertainty is more likely to increase reported income the 
higher are the probability of audit and the penalty. 

BDJ then test these predictions in the laboratory. Their experimental design follows 
closely both their theory and accepted experimental procedures. Subjects receive in- 
come at the beginning of each round and choose the amount of income to report. They 
pay taxes on income they choose to report; however, they do not know with certainty 
their “true” taxable income. If audited with some fixed random probability, the actual 
taxable income is determined by a drawing from a uniform distribution, operation- 
alized by the use of a bingo cage containing balls sequentially numbered over a known 
range. A low level of uncertainty is represented by a narrow range in the numbered 
balls, and a high level of uncertainty is represented by a wide range. Audits are deter- 
mined by the roll of a die, and a penalty is imposed on deficient taxes. Because their 
theory suggests that attitudes toward risk are an important determinant of reporting be- 
havior, BDJ also attempt to induce risk neutrality and risk preference using the method 
suggested by Berg et al. (1986). Several levels of the tax rate, the fine rate, and the proba- 
bility of detection are used. The experiment lasts 60 rounds. 
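
The mechanics of a single round described above can be illustrated with a short simulation. The Python sketch below is an illustration only, not BDJ's procedure: the function name `simulate_round`, its parameters, and the rule that the penalty equals the fine rate times the unpaid tax (with no refund for over-reporting) are assumptions made for the example.

```python
import random


def simulate_round(reported, ball_range, tax_rate, fine_rate, audit_prob, rng):
    """Simulate one reporting round (hypothetical sketch, not BDJ's code).

    reported   -- income the subject chooses to report
    ball_range -- (low, high) numbers on the bingo balls; a wider range
                  stands for greater taxable income uncertainty
    Returns (total_payment, audited, true_income).
    """
    low, high = ball_range
    taxes_paid = tax_rate * reported          # tax on reported income
    audited = rng.random() < audit_prob       # stands in for the die roll
    if not audited:
        return taxes_paid, False, None        # true income is never revealed
    true_income = rng.randint(low, high)      # one ball, uniform on [low, high]
    # Assumption: the penalty is the fine rate times the deficient tax,
    # and over-reporting earns no refund.
    deficient_tax = max(0.0, tax_rate * (true_income - reported))
    return taxes_paid + fine_rate * deficient_tax, True, true_income
```

Widening `ball_range` around the same midpoint represents the high-uncertainty treatment; narrowing it represents the low-uncertainty treatment.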

BDJ’s experimental results are generally consistent with their hypotheses, at least in
the comparative statics of the reporting decision. For the risk-neutral experiments, an 
increase in the fine rate or the probability of detection increases reported income, a 
change in the tax rate does not affect reported income, and an increase in uncertainty 
increases reported income when either the fine rate or the probability is increased. For 
the risk-averse experiments, the comparative statics results are somewhat weaker. The 
impact of the tax rate on reported income is only marginally significant, although a 
higher tax rate tends to increase reported income as predicted. Similarly, the effect of 
uncertainty on reported income is only marginally significant; again, the direction of 
change in reported income is consistent with the hypothesis (e.g., an increase in uncer- 
tainty decreases reported income when the tax rate is low, and increases reported in- 
come when the tax rate is high). 

The hypotheses on the levels of reported income are, however, not generally veri- 
fied by the experimental results. The actual levels of reported income are consistently 
higher than predicted, especially for the experiments in which risk aversion is induced. 
As noted earlier, such a result is common. 

These results are of considerable interest. They are well-grounded in (expected util- 
ity) theory, and they are generated from a sound experimental design. Further, these 
results expand considerably our experimental knowledge of behavioral responses. 
They confirm previous experimental results on the impact of audit and penalty rates on 
reported income. Unlike the results of earlier work, however, these suggest that an
increase in tax rates may increase reported income, and that the effect may also
depend upon attitudes toward risk. Most importantly, the results confirm the theoretical
speculation that uncertainty affects compliance, although the effects depend critically
upon the degree of risk aversion and the levels of tax, audit, and fine rates.

Nevertheless, conclusions and generalizations from these results must be made 
with caution. First, the theoretical and experimental results depend critically on the
assumption of a known distribution of reported incomes, and a uniform distribution is 
used to derive explicit theoretical solutions for reported income. However, taxpayers 
are unlikely to know the actual distribution of taxable incomes. Moreover, the uniform 
distribution leads to some odd kinds of behavior, since subjects clearly have no incen- 
tive to report income outside the ranges of the distribution. A more realistic assumption 
might be to have taxable incomes distributed normally around some mean level; this 
would allow an increase in uncertainty to be analyzed as a mean-preserving spread 
without changing the supports of the distribution. 

Second, BDJ use unrealistically high values (0.4, 0.5, and 0.9) for the probability of
audit. (BDJ also assume that the probability is fixed and random.) These high levels may 
be necessary to test the hypotheses. However, given that the percentage of tax returns 
selected for audit is now less than one percent, the levels in the experiment make it unlikely
that the experimental results will generalize to actual reporting behavior.14 It
would be useful to replicate these experiments with more realistic parameter values. 

Third, the experimental design relies heavily upon the method of Berg et al. (1986) 
for inducing risk preferences. The use of this method requires that subjects’ native preferences
be linear in the probabilities. More precisely, it requires that the independence
axiom of expected utility theory holds.15 If this assumption does not hold, then preferences
cannot be induced. However, there is a growing literature that argues convincingly
that preferences are not in fact linear in the probabilities (see, e.g., Machina
1989).16 More generally, in their theory and inducing mechanism BDJ require that individuals
maximize expected utility. As discussed earlier, there is extensive evidence to
the contrary.

The maintained assumption of expected utility theory may explain the general 
failure of the experimental results to accurately predict the levels of reported income, 
even though the conditions for doing so are optimal (e.g., perfect information, nonextreme
parameter values). However, the difficulties in predicting levels of compliance
should not be surprising, since expected utility theory is often incapable of explaining 
taxpayer reporting. 


The Collins and Plumlee Study 


The early theoretical models of tax compliance generally assumed that individuals 
faced a fixed random probability of audit. However, this assumption ignores a central 
feature of the compliance system in the United States: each individual must submit a 
tax return to the IRS, each return directly conveys information to the IRS, and it is 


14 It is, however, risky to assume that actual audit rates are a true measure of the probability of detection,
given the ability of the IRS to match third-party information reports to tax returns. Furthermore, some classes 
of taxpayers are subject to much higher than average audit rates because of the level or the nature of incomes in 
these classes. Finally, the perceived audit probabilities of many taxpayers are often much greater than the 
actual rates. 

15 The independence axiom states that the evaluation of a lottery is not affected by the replacement of some
element in the lottery with another element that is indifferent to it. Formally, the lottery [x] is preferred to the
lottery [y] if and only if the combined lottery [x with probability p, z with probability (1 - p)] is preferred to the
combined lottery [y with probability p, z with probability (1 - p)]. It can be shown that the independence axiom
is equivalent to the hypothesis that the individual preference function takes the expected utility form. See
Machina (1989).

16 Note that BDJ conduct some experiments in which risk preferences are measured, rather than induced,
in order to overcome this limitation. 


588 The Accounting Review, July 1991 


partly on the basis of this preliminary transmission of information that the IRS deter- 
mines whom to select for audit. Thus, the probability of audit is neither fixed nor ran- 
dom, but depends in part upon the actions of the taxpayers. 

Although recognition of this obvious feature of the compliance game has been slow, 
there are now several theoretical analyses of the taxpayer-agency interaction (Beck and 
Jung 1989; Graetz et al. 1986; and Reinganum and Wilde 1985, 1986). These studies gen- 
erally find that the presence of strategic audits significantly affects individual behavior. 
In particular, they demonstrate that the tax agency can generate greater levels of re- 
porting under a strategic audit policy than under a random audit policy. 

The analytical results are, however, only hypotheses, and CP are the first to examine
experimentally the implications of a strategic audit policy.17 They do this by introducing
three audit schemes. The first is the standard random audit scheme. The second
is a cutoff audit scheme, in which only those returns with the lowest levels of reported 
income are selected for audit. The third is called a conditional audit scheme. Here a 
preliminary piece of information is used by the agency to classify taxpayers into cate- 
gories in which the agency believes there are different ex ante probabilities of higher in- 
come levels; a cutoff scheme is then applied to the category with the higher ex ante level 
of income. CP examine the effect of these different audit strategies on two jointly determined
dimensions of choice: the amount of income earned (or effort) and the level of
underreporting. 

The experimental design of CP is straightforward but creative, especially in the 
operationalization of the audit schemes. Before the start of each round, each of ten sub- 
jects must earn income by performing a decoding exercise. Subjects then choose the 
amount of their earnings to report as taxable income, on which they pay taxes. If 
audited, they pay a penalty equal to a multiple of the deficient taxes. Two different tax 
rates and penalty rates are used. The three audit schemes are designed such that the ex 
post audit rate is 20 percent for each. In the random audit scheme, two of the ten sub- 
jects are randomly chosen; in the cutoff strategy, the two returns with the lowest levels 
of reported income are selected; in the conditional scheme, the ten reports are first par- 
titioned into two groups of five on the basis of earnings in a practice round, and then 
the two reports with the lowest levels of reported income in the high earnings category 
are selected. The experiment lasts four rounds. 
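
The three selection rules just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CP's implementation: the function names are hypothetical, ties are broken in favor of the lower subject index, and the "high earnings category" is taken to be the top half of subjects ranked by practice-round earnings.

```python
import random


def random_audit(reports, n_audits, rng):
    """Random scheme: choose n_audits subjects with equal probability."""
    return set(rng.sample(range(len(reports)), n_audits))


def cutoff_audit(reports, n_audits):
    """Cutoff scheme: audit the subjects with the lowest reported income."""
    order = sorted(range(len(reports)), key=lambda i: reports[i])
    return set(order[:n_audits])


def conditional_audit(reports, practice_earnings, n_audits):
    """Conditional scheme: split subjects in half by practice-round
    earnings, then apply the cutoff rule within the high-earnings half."""
    ranked = sorted(range(len(reports)),
                    key=lambda i: practice_earnings[i], reverse=True)
    high_group = ranked[: len(reports) // 2]
    order = sorted(high_group, key=lambda i: reports[i])
    return set(order[:n_audits])
```

With ten subjects and `n_audits=2`, each function returns two subject indices, which matches the 20 percent ex post audit rate held constant across the schemes.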

The experimental results of CP are broadly consistent with their speculations, but 
are not overly strong, especially for effort. The audit strategy is found to have a signifi- 
cant effect on compliance. The most underreporting occurs with the random strategy, 
and the least with the conditional scheme, as expected. CP also tend to find (with some 
notable exceptions) that an increase in the tax rate increases underreporting, while an 
increase in the penalty rate lowers underreporting. However, the results for effort seem 
generally disappointing. Although the audit scheme has a significant effect on effort, 
the results are not consistent with the expectations of CP.18 The effects of the tax rate
and the penalty rate on effort are not significant. 

Like those of BDJ, these results are of some importance, especially those regarding 


17 Also, see Beck et al. (1991a).

18 The results for effort are, however, plausible. The highest level of effort is found in the cutoff audit rule, 
not the postulated random scheme. It is likely that subjects realize that greater effort would allow them to 
underreport more and still have enough reported income to place them above the cutoff level. 


Alm—Experimental Analysis of Taxpayer Reporting 589 


the impact of the audit scheme on underreporting. For the first time, there is experi- 
mental confirmation of the theoretical speculation on the importance of the audit pro- 
cess, and there is evidence that the tax authority can indeed do better than a simple ran- 
dom audit strategy. 

There are, however, reasons for some caution. First, some results are weak and 
some are counterintuitive. As noted, effort is little affected by policy changes. Even for 
underreporting, the impact of the audit scheme is far from uniform. For example, the 
two tax/penalty combinations [low/low] and [high/high] generate roughly the same
level of underreporting for all three audit schemes. The impact of a penalty increase is
generally small, and in one case an increase in the penalty actually increases underreporting.

Second, despite its use in other experimental studies, it is possible that the decoding 
exercise is not a true measure of “effort.” Rather, it seems more likely to indicate ability,
either innate or learned. If the decoding exercise is more accurately seen as a test of
innate ability, then it is not surprising that the various policies have little impact on
“effort.” Moreover, if subjects become more skilled in decoding with more practice,
then it is also not surprising that “effort” generally increases over the four rounds of the
experiment.

Indeed, despite the sound theoretical reasons for analyzing the joint effort-under- 
reporting choice, potential problems with decoding as a true measure of effort argue for 
a replication of the experiment in which income is assumed fixed. It may also be that 
the joint choice of effort and underreporting complicates unnecessarily the subjects’ 
decisions. 

Third, although the experimental design is generally sound, some of its features
raise concern. For example, there are relatively few rounds (four) in each experiment, and 
there is some evidence that behavior changes over these rounds. The earnings of the sub- 
jects seem low for the length of the experiment. Also, CP use loaded instructions. Admit- 
tedly, there is some work (Alm, McClelland, and Schulze forthcoming) that suggests that 
this makes no difference to the results.19 There is also evidence that economic factors cannot
explain all compliance behavior, and economic factors are all that neutral terms can in- 
troduce in an experiment; that is, context is important in explaining compliance be- 
havior, and context can be introduced only by the use of loaded terms. However, the 
experimenter inevitably loses some control over preferences by the use of loaded in- 
structions. These features suggest that further replication is desirable. 

Fourth, there are some aspects of strategic audits that are not captured, either by 
the theoretical literature on which the experiments are based or by the experiments 
themselves. For example, unlike subjects in the experiments of CP, taxpayers do not 
know the IRS audit selection rule; in fact, the IRS has gone to some length to ensure the 
secrecy of its rule. There is also a dynamic aspect to the IRS audit selection that is 
ignored in the theory and the experiments: IRS discovery of underreported income this 
year may lead to audits both of previous and of future years. There are additional pieces 
of information that are known by the IRS and that may be used in its audit selection. It 
is reasonable to assume that the IRS knows the distribution of incomes of all taxpayers, 


19 Alm, McClelland, and Schulze (forthcoming) find that two compliance experiments in which the only
difference is the use of loaded and neutral instructions give virtually identical results. They attribute this result 


to providing subjects with complete and precise information, making mental scripts erte 
even if it does not know the income of a specific taxpayer. Such information is not
present in the experiments of CP. Finally, the theoretical and experimental work examines
only one type of interaction present in the tax compliance game, that between the
taxpayer and the agency, and omits the game among the taxpayers themselves.

Finally, and most importantly, as with nearly all work on taxpayer reporting, the re- 
sults of CP show much higher levels of reporting than is predicted by the theory. There 
are some difficulties in generating predictions, given that CP allow joint choice of effort 
and underreporting and that the probability of audit is sometimes nonrandom. Sup- 
pose, however, that one uses the standard expected utility model in which income is 
fixed and audits are random. Calculations of the optimal level of reported income sug- 
gest that for most of the parameter value combinations used by CP the level of compli- 
ance should be zero. The introduction of a variable labor supply complicates these cal- 
culations, but it is possible to show that even here compliance should be at or near zero. 
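
The zero-compliance prediction can be illustrated with a small numerical sketch of the fixed-income, random-audit model. The Python code below is an assumption-laden illustration, not CP's calculation: the function names are hypothetical, the default utility is risk-neutral, an audited taxpayer is assumed to pay the penalty multiple times the deficient tax in addition to the tax on reported income, and the tax and penalty values in the usage note are invented. Only the 20 percent audit rate comes from the design described earlier.

```python
def expected_utility(reported, income, tax_rate, penalty_multiple,
                     audit_prob, utility):
    """Expected utility of reporting `reported` out of a fixed `income`."""
    if_not_audited = income - tax_rate * reported
    # Assumption: the penalty is a multiple of the deficient tax.
    penalty = penalty_multiple * tax_rate * (income - reported)
    if_audited = if_not_audited - penalty
    return ((1 - audit_prob) * utility(if_not_audited)
            + audit_prob * utility(if_audited))


def optimal_report(income, tax_rate, penalty_multiple, audit_prob,
                   utility=lambda wealth: wealth, steps=100):
    """Grid search for the report that maximizes expected utility."""
    grid = [income * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: expected_utility(
        r, income, tax_rate, penalty_multiple, audit_prob, utility))
```

Under risk neutrality, expected income is linear in the report with slope `tax_rate * (audit_prob * penalty_multiple - 1)`, so the optimal report is zero whenever the audit probability times the penalty multiple is below one. With a 20 percent audit rate, the penalty multiple would have to exceed five before any reporting becomes optimal. For example, `optimal_report(100, 0.3, 2, 0.2)` selects a report of zero. A concave `utility` can be passed in to explore risk aversion.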


IV. Summary and Conclusions 


Laboratory methods have considerably expanded our knowledge of the taxpayer re- 
porting decision. These studies have examined issues that cannot be handled fruitfully 
by theory, and have given information on individual behavior in areas in which reliable 
data simply do not exist. More broadly, they have suggested new factors that seem 
likely to affect compliance behavior, as well as approaches that may prove useful in the 
analysis of reporting. Although experimental methods have their limitations, difficul- 
ties in the theoretical and empirical methodologies of compliance research make it im- 
perative that the potential contributions of the experimental approach be recognized 
and its use be maintained. 

The work of Beck et al. (1991b) and of Collins and Plumlee (1991) adds to this accumulating
experimental evidence. BDJ verify their theoretical propositions that uncertainty
about taxable income significantly affects reported income. In particular, their
results suggest that uncertainty seems most likely to lower reporting, given the actual 
levels of the various fiscal parameters. CP find that the nature of the audit scheme has a 
significant effect on tax compliance, and provide support for the common perception 
that the IRS may well know what it is doing in its nonrandom selection of returns. Both 
sets of results are new and interesting. 

However, as emphasized earlier, there are two major conclusions to be drawn from 
the literature on tax compliance. First, the compliance decision depends upon 
numerous factors, including but not limited to the enforcement activities of the government.
Second, the incorporation of these factors may well require a move beyond classical
expected utility theory and into theories of behavior suggested by psychologists,
sociologists, and anthropologists.20 Until these lessons are acted upon, I think it unlikely
that we will be able to explain much of the reporting decisions of taxpayers. In
particular, although we may continue to have some success in explaining the changes 
in reporting behavior, we will fail to explain the level of reporting. 

The major factors that must be considered in any theory of compliance are, I be- 
lieve, obvious. One demonstrated factor is the threat of detection and punishment, 
including the recognition that individuals may overweight a low probability of detec- 


20 See, for example, the discussion in Roth et al. (1989) and the surveys of Machina (1987, 1989).
tion and that this probability depends in part on one’s actions. Another factor is government
expenditures. Individuals may pay taxes because they value the public goods provided
by government and they recognize that their taxes may be necessary to get others
to contribute to the finance of the public goods, both now and in the future.21 Social
norms also enter the reporting decision. Survey evidence indicates that compliance is
strongly affected by the taxpayer’s commitment to the social norm of compliance, that
those who believe that others cheat are more likely to cheat themselves, and that an individual’s
perception of how he or she is treated relative to others affects compliance.22
There are, of course, additional factors: uncertainty, the role of tax practitioners, the 
process by which government decisions are made, and so on. 

Incorporation of these factors in a formal theory of compliance is a difficult task. It 
is a task that also is likely to require the exploration of alternative theories of behavior 
under uncertainty. It is, finally, a task in which experimental economics can play a 
major role. It may not be possible to develop one theory that explains the behavior of all
individuals at all times, or even one that explains the actions of the same person at all
times. However, until this effort is made, I think it unlikely that we will come much
closer to explaining the compliance puzzle. 


21 There is a large literature that supports the notion that compliance depends in part on the use of tax revenues.
Some of this work (e.g., Isaac et al. 1985) looks at voluntary contributions to public goods. Other work
(e.g., Alm, McClelland, and Schulze forthcoming) looks directly at the impact of government expenditures on
compliance. Both strands often conclude that individuals pay more as the benefit from their contributions
increases. See Alm et al. (1991b) for a more detailed discussion.

22 See, for example, the studies cited in footnote 12.
SYNOPSIS: Under existing generally accepted accounting principles, no
compensation expense is recorded for executive stock options (ESOs) if the 
exercise price on the date of grant is equal to (or greater than) the market 
price of the stock. Similarly, only negligible compensation expense tends to 
be recorded if the exercise price on the date of grant is less than the market 
price of the stock. The inadequacy of this method (see Boudreaux and Zeff 
1976; Smith and Zimmerman 1976; and Weygandt 1977) has led the 
Financial Accounting Standards Board (FASB) to consider a proposal to 
measure compensation related to grants of ESOs at their fair values, with a 
lower bound constraint. A candidate model for the estimation of fair value 
(Swieringa 1987) is the Black and Scholes (B-S) (1973) pricing model with 
the Merton (1973) modification that allows for continuous-dividends. It 
would seem natural (and we infer that the FASB would opt) to use the 
continuous-dividend version of the B-S model for firms that pay cash divi- 
dends and the no-dividend version for firms that do not pay dividends. 
Note that if cash dividends are assumed to be zero (as would be the case 
for firms that do not pay dividends), the continuous-dividend version re- 
duces to the original B-S formulation. Thus, we label the B-S continuous- 
dividend model (subject to the stipulation that the B-S estimate not be less 
than the number yielded by the FASB’s minimum-value model) as the 
FASB proposal. This labeling applies whether the grant date or the vesting 
date is considered to be the measurement date (discussed below). 

To examine the income effect of changing the accounting method of 
ESOs, this study applies the FASB’s proposal to a random sample of firms
that granted stock options in order to assess the impact of the related com- 
pensation expense on operating income. A second objective is to compare 
ESO compensation estimates from the two models underlying the FASB 
proposal: (1) the B-S continuous-dividend model, and (2) the FASB’s 
minimum-value procedure (discussed subsequently). 

In addition, the latest FASB proposal requires that stock option com- 
pensation be measured as of the vesting date, as opposed to the date of 
grant. Thus, a third objective is to provide evidence as to whether vesting- 
date estimates of ESO compensation are significantly different from ESO 
estimates generated on the grant date. 

The results indicate that, using a three percent materiality threshold,
more non-dividend-paying firms (about 30 percent) would have material
income effects than dividend-paying firms (about eight percent). Furthermore,
using alternative measures of service periods shorter than the lives of
options produces material ESO compensation expense for a high percentage
of sample firms. Finally, applying the FASB proposal on the basis of
the vesting date would result in a lower income effect than applying it on 
the basis of the date of grant. In general, material income effects are 
observed when the FASB’s proposal is adopted irrespective of the 
valuation model used. 


Key Words: FASB, Stock options, Executive compensation. 
Data Availability: Data are available on request from the authors. 


THE remainder of the article proceeds with a summary of the FASB’s ESO proposal.1
Section II considers theoretical issues that arise in applying the B-S model
in the context of ESOs, and section III describes the sample. Section IV concerns
the methods employed to obtain the necessary variance, dividend yield, and risk-free 
rate values, and our results are presented in section V. The final section contains 
concluding remarks. 


I. The FASB Proposal 


In its latest proposal (Swieringa 1987), the FASB tentatively concluded that compensation
costs for grants2 of fixed-term3 ESOs should be calculated and recorded as of


1 The FASB’s proposal is currently on the board’s agenda, although the issue of stock compensation, which 
includes ESOs, will be discussed in Phase Three of the FASB’s financial instruments project—Debt/ Equity 
Characteristics. Whether the financial instruments project will affect the FASB proposal remains an open 
issue. See, for example, Ernst & Young (1990).

2 Currently, ESOs generally are of three types: nonqualified, qualified, and incentive stock options. Irre- 
spective of which type, boards of directors typically grant ESOs with exercise prices equal to the market prices 
of the related stock on the date of the grant. To qualify for favorable tax treatment under the Internal Revenue 
Code as either an incentive or qualified stock option, an ESO must be granted with an exercise price that is not 
less than the market price on the date of grant. Nonqualified stock options do not meet the statutory require- 
ments and may be granted with exercise prices lower than the market price on the date of grant. The stock op- 
tions examined in the current study are restricted to (1) incentive stock options, (2) qualified stock options, and 
(3) nonqualified options with the exercise price set at no less than the market price on the date of grant. In other 


Foster II, Koogler, and Vickrey—Executive Stock Options 597 


the vesting date.4 The board proposed that a fair-value model5 be used to estimate compensation
cost with the stipulation that the fair value cannot be lower than the estimate
generated by the board’s minimum-value model. The board defined minimum value as 
the market price of the grantor’s optioned stock on the vesting date minus the present 
value of the exercise price and the cash dividends expected to be distributed during the 
period in which ESOs remain outstanding, with zero as a lower bound (Swieringa 
1987). The board’s minimum-value model is represented by the expression: 


$$\max\left[0,\; S - PV(X) - PV(D)\right], \qquad (1)$$


where PV denotes the calculation of present value with discrete compounding and a 
risk-free rate, S is the market price of the optioned stock on the vesting date, X repre- 
sents the exercise price of the ESO, and D is the per-share dividend yield. 
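
Expression (1) is straightforward to compute. The Python sketch below is an illustration under stated assumptions rather than the board's or the authors' procedure: the function names are hypothetical, discounting is annual and discrete at the risk-free rate, and expected dividends are supplied as a list of (years from the measurement date, cash amount per share) pairs, a representation chosen only for this example.

```python
def present_value(amount, rate, years):
    """Present value with discrete (annual) compounding."""
    return amount / (1.0 + rate) ** years


def minimum_value(stock_price, exercise_price, expected_dividends,
                  risk_free_rate, years_outstanding):
    """Minimum value per expression (1): max[0, S - PV(X) - PV(D)].

    expected_dividends -- list of (years_from_now, cash_per_share) pairs
                          (a hypothetical input format).
    """
    pv_exercise = present_value(exercise_price, risk_free_rate,
                                years_outstanding)
    pv_dividends = sum(present_value(cash, risk_free_rate, when)
                       for when, cash in expected_dividends)
    return max(0.0, stock_price - pv_exercise - pv_dividends)
```

For an option granted at the money on a stock that pays no dividends, the minimum value is simply the stock price less the discounted exercise price, which is positive whenever the risk-free rate is positive.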


II. The B-S Model and ESO Compensation Estimates 


The Black and Scholes (B-S) model (1973) provides the equilibrium value of a Euro- 
pean call option (i.e., one not exercisable until maturity), and the theory of option valua- 
tion is discussed extensively in Merton (1973) and in Smith (1976). The similarity of call 
options and ESOs is reviewed in Noreen and Wolfson (1981).6

ESOs are similar to European call options in all attributes except exercisability, 
marketability, and the probability of forfeiture should key employees leave the firm 
before the vesting date. Unlike European options, ESOs can be exercised prior to the 
expiration date. Merton (1973) contains a proof that a rational investor will never exercise
an option early if the optioned stock does not pay dividends. Thus, the early-exercise
feature does not affect the value of an American option7 (i.e., one exercisable at any





words, the grants of stock options in this study include only those for which zero compensation was recorded. 
To make certain that no stock option compensation was recorded, we did not consider firms with tandem stock 
option/stock appreciation rights plans. 

3 The terms of ESO plans may be fixed or variable. Under fixed-term plans, both the number of shares and
exercise price are known on the grant date; under variable plans, at least one of these factors is not known on 
this date and is contingent upon future events. We focus only on fixed-term plans because of the formidable dif- 
ficulty in estimating the compensation related to variable plans, given extant disclosures and uncertainties 
about future events. 

4 The selection of the vesting date was a change from the FASB’s initial position that compensation costs
for ESOs should be calculated and recorded as of the grant date. Robbins (1988, 66) describes this shift as... 
surprising, since vesting-date measurement was certainly not a front-runner at the inception of the project. In- 
deed, the arguments for and against vesting date were tucked away in an appendix of the invitation to comment 
rather than in the main body of the document. The rationale given for the change was that the grantee does not 
have an unconditional right to the ESOs until the vesting date. We believe that compensation related to ESOs 
should be measured when the transaction occurs—on the date of grant. We address the implications of using 
the vesting date in a later section. 

5 Since there is uncertainty about the FASB’s final position on ESO accounting, we also applied other 
option-valuation models individually to assign values to ESOs in assessing the effects of ESO compensation on 
operating income. The additional models are (1) the B-S model (the upper boundary whether or not the firm 
pays dividends), and (2) the lower bound on the B-S continuous-dividend model. As shown later, our overall
conclusion is that ESOs are likely to have a material impact on operating income under the FASB proposal. A 
similar conclusion applies in the case of each of the alternative models, but, for purposes of brevity, we do not 
discuss the specific results from these models. 

6 Swieringa (1987) also identifies an employee stock option as a call option contract.

7 Roll (1977) demonstrated that the value of an unprotected American option approaches the B-S valuation
when the ex-dividend date is used in the model in place of the option’s contracted expiration date. Roll’s analysis
depends on only one dividend payment of a known magnitude between the contract entry date and the ex-
time up to the expiration date) on a nondividend stock.8 ESOs cannot be traded; when a
key employee leaves the firm before his or her ESOs have vested, the options are for- 
feited. These restrictions are potentially severe departures from option-pricing theory. 

In the Black and Scholes (1973) formulation, the price of a European call option is a 
function of two variables (stock price and time to expiration) and three parameters 
(exercise price, risk-free interest rate, and the ex ante variance of the returns on the op- 
tion security). Merton (1973) provides a modification of the basic B-S model to allow for 
cash dividends that are distributed continuously and are a constant proportion of the 
optioned share price. This continuous-dividend version of the B-S model is: 


$$V = \left[e^{-\ln(1+k)\,t}\, S\, \Phi(Z) - e^{-\ln(1+r)\,t}\, X\, \Phi\!\left(Z - \sigma\sqrt{t}\right)\right], \qquad (2)$$


where V is the value of the option, S is the stock price, X is the exercise price, r is the
risk-free rate,9 $\sigma^{2}$ is the variance of return on the optioned stock, t is the time until the
option matures, k is the continuous-dividend yield as a constant proportion of the
underlying share price, $\Phi(\cdot)$ is the cumulative standard normal distribution function, and Z is
defined as:


$$Z = \frac{\ln(S/X) + \left(\ln(1+r) - \ln(1+k) + \sigma^{2}/2\right) t}{\sigma\sqrt{t}}.$$


For ESOs on nondividend stocks, the continuous-dividend model reduces to the unadjusted
B-S formula.10
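
The continuous-dividend valuation in equation (2) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the sketch assumes that the annual rates r and k are converted to continuous rates as ln(1+r) and ln(1+k), following the convention the article attributes to Noreen and Wolfson (1981), both in Z and in the discount factors.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_continuous_dividend(stock, exercise, r, k, sigma, t):
    """Value of a European call with a continuous dividend yield.

    r, k  -- annual risk-free rate and dividend yield; converted below to
             continuous rates ln(1+r) and ln(1+k) (assumed convention)
    sigma -- standard deviation of return on the optioned stock
    t     -- time until the option matures, in years
    """
    r_cont = math.log(1.0 + r)
    k_cont = math.log(1.0 + k)
    spread = sigma * math.sqrt(t)
    z = (math.log(stock / exercise)
         + (r_cont - k_cont + sigma ** 2 / 2.0) * t) / spread
    return (math.exp(-k_cont * t) * stock * norm_cdf(z)
            - math.exp(-r_cont * t) * exercise * norm_cdf(z - spread))
```

Setting `k=0` gives the unadjusted B-S value, which is the reduction noted above for nondividend stocks. A positive `k` lowers the estimate for the same remaining inputs.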

For the B-S continuous-dividend model, the stock price, exercise price, time to 
expiration, and risk-free rate are relatively easy to observe. The greatest estimation 
problems involve the variance rate and the dividend yield. The problems with the esti- 
mation of the variance have been documented in the finance and economics literature’ 
(see Black and Scholes 1972; Boyle and Ananthanarayanan 1977; Galai 1989; Galai and 
Schneller 1978; Macbeth and Merville 1979; Merton 1976). In addition, the model specifies
a continuous-dividend yield that is a constant proportion of the share price. However,
firms usually distribute cash dividends quarterly (i.e., cash dividends are discrete),
and these distributions probably will not equate to a yield that is a constant proportion
of share price.


III. The Sample


A sample of 214 grants of ESOs was obtained from a random sample of 560 annual 
reports taken from approximately 3,500 annual reports for the fiscal year 1985 on 





piration date. Geske (1978) derives a formula for the value of a European option on a stock with a stochastic 
dividend yield, but his model relies on unobservable parameters and thus would probably not be suitable for 
accounting applications. 

8 The early-exercise feature is potentially valuable if the optioned stock pays cash dividends. Optionees
(executives in our study) can decide to participate in dividend distributions by exercising ESOs prior to expira- 
tion. Executives would take advantage of the early-exercise opportunity (ignoring the tax effects and cash re- 
quirements at exercise) when they predict a greater gain from dividend distributions than from holding the 
ESO until maturity. The early-exercise feature suggests the possibility of an option value at least as great as that 
produced by the B-S continuous-dividend model (European option). 

9 In all of our applications of the B-S models, we followed Noreen and Wolfson (1981) and used ln(1+r) and
ln(1+k) as the continuous risk-free rate and the continuous-dividend yield, respectively.

10 Recent research has employed the continuous-dividend version to assign values to ESO forms of
compensation (see, e.g., Antle and Smith 1985; Murphy 1985).


microfiche in the University of Arizona library. All 560 firms in the random sample
granted ESOs to their key employees, but footnote disclosure of the ESO plans and 
grants under those plans was sometimes inadequate for the purposes of this research.¹¹
To ensure a final sample of at least 200 ESO grants,¹² the following decisions were
made. When a firm disclosed a range of exercise prices, the midpoint of the range was 
designated to be the exercise price (approximately 35 percent of the cases). When the 
firm disclosed the aggregate exercise price for ESOs granted during fiscal year 1985, 
the aggregate amount was divided by the number of ESOs granted, and this weighted 
average was used as the exercise price (approximately ten percent of the cases). When a 
firm failed to explicitly disclose the time to expiration, this period was considered to be 
ten years (approximately 35 percent of the cases). When a firm did not disclose the date 
of grant, the midpoint of the fiscal year was designated as the date of grant (100 percent 
of the cases).¹⁴
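The fill-in rules just described can be summarized in a short sketch. The function names, argument names, and the order in which the rules are tried are assumptions made for illustration; the inputs in the usage lines are hypothetical.

```python
from typing import Optional, Tuple


def assumed_exercise_price(disclosed_price: Optional[float] = None,
                           price_range: Optional[Tuple[float, float]] = None,
                           aggregate_exercise_price: Optional[float] = None,
                           options_granted: Optional[int] = None) -> float:
    """Exercise price used when the footnote disclosure is incomplete."""
    if disclosed_price is not None:
        return disclosed_price
    if price_range is not None:
        low, high = price_range
        return (low + high) / 2.0  # midpoint of the disclosed range
    if aggregate_exercise_price is not None and options_granted:
        # weighted average: aggregate exercise price per option granted
        return aggregate_exercise_price / options_granted
    raise ValueError("exercise price cannot be inferred from the disclosure")


def assumed_life_years(disclosed_life: Optional[float] = None) -> float:
    """Time to expiration; ten years when not explicitly disclosed."""
    return disclosed_life if disclosed_life is not None else 10.0


# Hypothetical disclosures
price_from_range = assumed_exercise_price(price_range=(28.0, 32.0))
price_from_total = assumed_exercise_price(aggregate_exercise_price=30000.0,
                                          options_granted=1000)
```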

As mentioned earlier, we believe that the date of grant is the appropriate date on 
which to measure total ESO compensation; consequently, our initial analysis assumes 
grant-date accounting. We address vesting-date accounting subsequently. To apply the 
FASB’s minimum-value model, we need the exercise price of the ESO, the market price 
of the optioned stock on the date of grant, the risk-free rate, the life of the ESO, and,
when applicable, the dividend yield. In addition, we need an estimate of the variance of
returns to apply the B-S continuous-dividend model.¹⁵ To estimate the total compensation
that would be associated with a firm granting stock options under the FASB’s
proposal, we first compare the appropriate B-S estimate with the number yielded by the 
FASB’s minimum-value procedure. The larger of the two is multiplied by the number of 
options granted in fiscal 1985 to yield the total compensation associated with the granting
of the ESOs. Had the sampled firms applied these procedures, the result would have
been the creation of a deferred compensation account. This deferred compensation 
would then be amortized as a charge to income over the period that the executive per- 
forms his or her services. 
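A minimal sketch of this procedure follows, taking the two per-option estimates as given inputs. The function names and the dollar amounts are hypothetical, and straight-line amortization is assumed.

```python
def total_compensation(bs_value: float, minimum_value: float,
                       options_granted: int) -> float:
    """Larger of the two per-option estimates times the options granted."""
    return max(bs_value, minimum_value) * options_granted


def annual_charge(total: float, service_years: float) -> float:
    """Straight-line amortization of the deferred compensation."""
    return total / service_years


# Hypothetical grant: 1,000 options, B-S estimate $3.00, minimum value $2.40
deferred_compensation = total_compensation(3.00, 2.40, 1000)
yearly_expense = annual_charge(deferred_compensation, 10.0)
```

With these hypothetical inputs the B-S estimate governs, giving deferred compensation of $3,000 and a yearly charge of $300 over ten years.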

The choice of amortization period requires explanation. There are three candidates:
(1) the time to expiration, the period that begins on the date of grant and ends on
the date the ESOs expire; (2) the vesting period, the time from the date of grant to the


11 Companies were eliminated from consideration for various reasons, including the following: (1) the
number of options granted was not disclosed (nine firms), (2) the exercise price of the options granted during
the period was not disclosed with sufficient clarity, i.e., disclosure of a range of exercise prices since the
inception of the plan (131 firms), (3) both the exercise price and the number of options granted were not clearly
disclosed (110 firms), (4) the full range of prices needed for variance computations was not available on the 1989
CRSP tape (41 firms), and (5) we were unable to clearly ascertain that no compensation pertaining to stock
options had been recorded (55 firms).

12 The necessary data could not be obtained from proxy statements. Proxy statements contain details on 
ESO plans, but only in the year that a particular plan is presented to shareholders for approval. With respect to 
plans that have already been approved by shareholders, proxy statements contain information only for certain 
high level executives; they do not contain information on all the options granted during the fiscal year. 

13 We decided (arbitrarily) that the range could not be greater than ten percent of a simple average of the
firm's stock price for the fiscal year. Firms with wider ranges were discarded. 

14 To test the sensitivity of our results to this assumption, we used the market price of each sampled security
as of the beginning of fiscal year 1985 and replicated our basic analysis. The results were virtually the same as
those reported using midyear prices.

15 To address the vesting-date issue, we also gathered the necessary data (variances, dividend yields,
risk-free rates, and market prices) to apply the various option models to the sampled ESOs at two subsequent
points in time: fiscal year-end 1985 and fiscal year-end 1986.

date the ESOs are first exercisable; and (3) the service period, that is, the time over which the
executive performs services that are, at least in part, compensated with the ESO grant. 
The FASB proposal does not require that the service period coincide with either the 
time to expiration or the vesting period. We assume that ESO compensation cost should 
be amortized over the service period, in conformance with basic accrual accounting 
theory. This assumption also is consistent with the responses of chief financial officers 
(CFOs) to a letter requesting information about the amortization period (see sec. V). The
CFOs indicated that the amortization period related to their ESOs was neither the time 
to expiration nor the vesting period. 

Accounting Principles Board Opinion 25 states that the service period may be iden- 
tified in the plan itself, or it may be inferred. None of our sampled firms disclosed a ser- 
vice period in the stock option footnote. If the service period is to be inferred from the 
terms of the plan, it seems reasonable to assume that the service period is either the life 
of the option or the period from the date of grant to the date the options first become 
exercisable (which generally is the vesting date). The second alternative is consistent 
with the service period that is commonly employed for stock appreciation rights.¹⁶
However, we initially assume that the service period is the life of the option itself. A
long service period should bias our results against the possibility that ESO compensa- 
tion will materially affect the operating income figures of the sampled firms. We then 
analyze the impact of a change in the service period to the span of time from the date of 
grant to the date the option is first exercisable. 


IV. Variance, Dividend Yield, and Risk-Free Rate Calculations 


Various statements and evidence found in the literature (Black and Scholes 1972; 
Boyle and Ananthanarayanan 1977; Latane and Rendleman 1976; Merton 1976) 
indicate that bias in B-S model estimates can be significantly reduced by extending the 
time period over which stock returns are collected for variance calculations and by
employing returns occurring after the measurement date (post-grant observations). Yet
Noreen and Wolfson (1981) found that post-measurement date (analogous to post-grant date)
return observations did not improve the predictive ability of the continuous-dividend
model. These apparently conflicting results led us to base our calculations
of variances on 60-day estimation periods after each valuation date.¹⁷
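A variance estimate of this kind can be sketched as below. Several details are assumptions made only for illustration and are not taken from the text: the use of log returns, the n - 1 divisor, and annualization by a hypothetical 252 trading days. The function name is likewise hypothetical.

```python
import math
from typing import Sequence


def return_variance(prices: Sequence[float], periods_per_year: int = 252) -> float:
    """Annualized sample variance of log returns from consecutive prices."""
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    n = len(returns)
    if n < 2:
        raise ValueError("need at least three prices")
    mean = sum(returns) / n
    per_period_var = sum((x - mean) ** 2 for x in returns) / (n - 1)
    return per_period_var * periods_per_year
```

For a 60-day estimation period beginning on a valuation date, `prices` would hold the 61 daily closing prices that bracket those 60 returns.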

Dividend yields (represented as k) were estimated as the average quarterly cash 
dividend paid per share, for the year subsequent to each valuation date, divided by the 
price of the stock on the valuation date. Quarterly compounding was utilized for the 
present-value calculations of the FASB’s minimum-value model. Dividend yields for the 
B-S continuous-dividend model were also based on a quarterly k. In the application of 
the B-S continuous-dividend model, we used the natural logarithm of 1+k as our proxy 
for the continuous-dividend yield. The data required to calculate dividend yields were 
taken from the 1989 Center for Research in Security Prices (CRSP) daily master tape. 
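The dividend-yield calculation can be illustrated with a brief sketch. The function names and the sample figures are hypothetical.

```python
import math
from typing import Sequence


def quarterly_dividend_yield(quarterly_dividends: Sequence[float],
                             price: float) -> float:
    """k: average quarterly cash dividend per share divided by the stock price."""
    return (sum(quarterly_dividends) / len(quarterly_dividends)) / price


def continuous_dividend_yield(k: float) -> float:
    """Proxy for the continuous-dividend yield: ln(1 + k)."""
    return math.log(1.0 + k)


# Hypothetical firm: four quarterly dividends of $0.30 on a $30 stock
k = quarterly_dividend_yield([0.30, 0.30, 0.30, 0.30], 30.0)
k_continuous = continuous_dividend_yield(k)
```

Because ln(1+k) is smaller than k for any positive k, the continuous proxy is always slightly below the discrete yield.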


16 See Interpretation No. 28, par. 3, FASB (1981).

17 Variance rates were calculated with both post- and pre-grant return observations for 20-day, 40-day, and
60-day periods giving six variance estimates. The pre-grant collection periods ended on the date of grant; the 
post-grant collection periods began on the date of grant. Compensation estimates from the B-S model were 
essentially the same, no matter which of the six return periods was used to calculate the variance rate. For the 
vesting-date analysis, we also employed variances based on 60-day returns beginning on each valuation date. 
In accordance with one of the methods in Noreen and Wolfson (1981), the risk-free
rate for the B-S models was estimated as the yield for a treasury note or bond that 
matures closest in time to the expiration date of the ESO. Treasury note and bond yields 
were taken from The Wall Street Journal. Yields on securities with deep discounts were 
ignored. Our proxy for the risk-free rate in the B-S models was the natural logarithm of 
1+r. The risk-free rate for the FASB’s minimum-value model was simply r. 
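The selection rule can be sketched as follows, assuming the list of quotes has already been screened to exclude deep-discount securities. The function names, maturities, and yields are hypothetical.

```python
import math
from datetime import date
from typing import Sequence, Tuple


def closest_maturity_yield(quotes: Sequence[Tuple[date, float]],
                           expiration: date) -> float:
    """Yield of the note or bond maturing closest to the ESO expiration date."""
    _, r = min(quotes, key=lambda q: abs((q[0] - expiration).days))
    return r


def bs_risk_free_rate(r: float) -> float:
    """Proxy used in the B-S models: ln(1 + r)."""
    return math.log(1.0 + r)


# Hypothetical Treasury quotes: (maturity date, yield)
quotes = [(date(1994, 11, 15), 0.101),
          (date(1995, 8, 15), 0.104),
          (date(1996, 2, 15), 0.105)]
r = closest_maturity_yield(quotes, expiration=date(1995, 6, 30))
r_bs = bs_risk_free_rate(r)  # the minimum-value model would use r itself
```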


V. Results 
Comparison of Fair Value and Minimum Value Under the FASB Proposal 


Recall that the FASB proposal requires that the fair-value estimate of ESO compen- 
sation may not be lower than the estimate from its minimum-value model. Relating the 
FASB’s minimum-value estimate to the estimate yielded by the B-S model, figure 1 
shows that the minimum-value estimates were higher than the continuous-dividend 
estimates for 92 of the 170 firms that distributed cash dividends.¹⁸ That is, 54 percent of
the dividend-paying cases would have been affected by the minimum-value constraint. 
For firms that do not pay dividends, the FASB proposal is equivalent to estimation of 
ESO compensation with the B-S model because this model always generates larger estimates
of ESO values than the FASB’s minimum-value model, as seen in figure 1. For the two
subsamples combined, the B-S estimates would have been used in approximately 64 percent
of the cases. Nevertheless, the FASB’s minimum-value model is potentially important
since ESO compensation would have been calculated with this model in 36 percent
of the cases.¹⁹

We also used the Mann-Whitney U to test the similarity of the cumulative distribu- 
tion functions for the ESO compensation estimates from the FASB’s minimum-value 
procedure and the B-S model. The similarity of the two estimates (null hypothesis) 
could not be rejected for either the dividend-paying or the nondividend-paying sub- 
groups (the probability levels were 0.95 and 0.61, respectively). A reasonable tentative 
implication would be to discard the more costly model, the fair-value model, if the mea- 
surement date is the date of grant. 
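For readers unfamiliar with the test, a self-contained sketch of a Mann-Whitney U computation follows. It is not the authors' code: it uses the large-sample normal approximation with a tie correction and no continuity correction, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U for sample x, two-sided p-value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        group = j - i
        tie_term += group ** 3 - group
        rank_sum_x += avg_rank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, p
```

A large p-value, such as those reported above, means the hypothesis that the two sets of estimates come from the same distribution cannot be rejected.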


Materiality 


For each firm, the compensation generated by the FASB proposal was
divided by the life of the option, resulting in a yearly charge to compensation expense. 
Because the midyear grant date assumption results in the recognition of only half a 
year’s compensation expense in fiscal 1985, we focused on fiscal 1986. This annual 
ESO compensation chargeable to income was divided by the absolute value of the 1986 
operating income (or loss).²⁰ The results of the 1986 compensation ratio are reported in
figure 2.
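The ratio and the materiality screen can be sketched as follows. The names and amounts are hypothetical, and treating a ratio exactly equal to the threshold as material is an assumption.

```python
def compensation_ratio(total_compensation: float, option_life_years: float,
                       operating_income: float) -> float:
    """Yearly ESO charge divided by the absolute value of operating income."""
    yearly_charge = total_compensation / option_life_years
    return yearly_charge / abs(operating_income)


def is_material(ratio: float, threshold: float = 0.03) -> bool:
    """Materiality screen at a three percent (default) or other threshold."""
    return ratio >= threshold


# Hypothetical firm: $3,000 total compensation, ten-year options,
# operating loss of $7,500 in 1986
ratio = compensation_ratio(3000.0, 10.0, -7500.0)
material_at_3 = is_material(ratio, 0.03)
material_at_5 = is_material(ratio, 0.05)
```

Here the ratio is four percent, so the charge would be material under the three percent threshold but not under the five percent guideline.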


18 Two of the dividend-paying firms had minimum-value estimates of zero because of large quarterly
dividends. The corresponding B-S estimates for these two firms were positive. In only two instances did the
application of the FASB proposal on the date of grant and current practice (for a sampled firm) yield the same result.

19 We replicated the above analysis by changing the life of every ESO to eight years and then to five years.
Under these respective scenarios, 66 and 19 dividend-paying firms would have had minimum-value estimates
higher than the continuous-dividend estimates. Given these conditions, the FASB’s minimum-value model
would have been used in 51 and 71 percent of the cases. In reality, 77 percent of the sampled firms had option
lives of ten years, and 20 percent set the option life at five years. The remaining firms set the time to expiration
at either six or eight years.

20 Twenty-four of the dividend-paying firms and 18 of the nondividend-paying firms had losses in 1986.


602 The Accounting Review, July 1991 


Figure 1
Frequency Distribution of the Ratio of Minimum-Value to B-S Estimates 


Panel A. Dividend-Paying Firms (n=170): 
Panel B. Nondividend-Paying Firms (n=44):
Note: This ratio is the compensation given by the FASB’s minimum-value procedure divided by the
compensation yielded by the Black-Scholes model adjusted for dividends.


Given a three percent materiality threshold, approximately eight percent of the
dividend-paying firms would have disclosed material compensation expense under the
FASB proposal. With a five percent materiality guideline, six percent of the
dividend-paying firms would have disclosed material compensation expense. Comparing the two
cases shows that the overall effect on income of the proposed FASB rule is higher for
firms that do not pay dividends than for dividend-paying firms. With a three (five)
percent materiality guideline, 30 (18) percent of the nondividend-paying firms would have
disclosed material compensation expense.²¹ Our initial results indicate that ESO
compensation would have a material effect on operating income for most firms in our
sample, even when the amortization period is lengthy (the life of the option). In this context,
firms that do not pay dividends are much more likely to have material ESO compensation
than the dividend-paying firms.

Figure 2
Cumulative Percentages of the 1986 ESO Compensation Ratio

There are two explanations for the greater effect on nondividend firms than on
dividend firms. First, the B-S model adjusted for dividends or the FASB minimum-value 
model, for any given set of parameters, will always provide a value lower than that of 
the respective unadjusted version of the model. Thus, a nondividend firm that granted 
the same number of ESOs with identical terms as those granted by a dividend firm 
would record more compensation. Second, nondividend and dividend-paying firms 
might differ by the attributes that make ESOs a relatively more attractive form of execu- 
tive compensation. 

Change in Service Period. We have thus far assumed that the amortization period 
is the life of the option. Another feasible alternative is to consider the service period as 
the span of time from the date of grant to the first date the option is exercisable. We 
examined the stock-option footnotes of the ESOs in our sample and found that approxi- 
mately half of the sampled firms clearly disclosed when their ESOs were first exercis- 
able. In the footnotes that clearly stated the exercisability features, the two most 
frequently encountered were: (1) fully exercisable after a one-year waiting period (one- 
year service period), and (2) exercisable at 25 percent per year beginning one year after 
grant date (five-year service period). The potential for length of service period to have a 
large impact on our results motivated us to survey the CFOs of the 214 sampled firms, 
asking them to specify the probable service period if compensation were to be recorded.
The 47 usable responses indicated an average service period of approximately
seven years.²²

To apply the FASB proposal, we calculated the charge to 1986 income for each sampled
firm assuming (1) a one-year service period, (2) a five-year service period, and (3) a
seven-year service period.²³ We then divided the charge to income by the absolute value
of 1986 income to get the 1986 ESO compensation ratio.
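The three charges can be sketched as follows. The 52.08 percent applied to the five-year case is reproduced here by treating the grant as four equal tranches that vest after one, two, three, and four years, each amortized straight-line over its own vesting period. That reading of Interpretation No. 28, example 2, is an assumption inferred from the arithmetic, and the function names are hypothetical.

```python
def first_year_fraction_graded(tranches: int) -> float:
    """Fraction of the model value charged in year one under graded vesting.

    Each equal tranche (1/tranches of the grant) is amortized straight-line
    over its own vesting period of 1, 2, ..., tranches years.
    """
    return sum((1.0 / tranches) / years for years in range(1, tranches + 1))


def charge_to_1986_income(model_value: float, service_period: str) -> float:
    """Charge under the one-, five-, and seven-year service periods."""
    fractions = {
        "one-year": 1.0,                              # full model value
        "five-year": first_year_fraction_graded(4),   # 25 percent per year
        "seven-year": 1.0 / 7.0,                      # straight-line
    }
    return model_value * fractions[service_period]


graded_percent = round(first_year_fraction_graded(4) * 100.0, 2)
```

Under this reading the first-year fraction is 0.25 × (1 + 1/2 + 1/3 + 1/4), which rounds to 52.08 percent.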

Figure 3 presents the 1986 compensation ratios under the FASB proposal for the
above service periods. For each service period, we applied both a three percent and a 
five percent materiality threshold. The results for the five percent criterion are reported 


21 We also recomputed 1986 ESO compensation ratios under the assumption that the life of every ESO
was five years (and a service period of five years). Under this scenario, 11 (eight) percent of the dividend- 
paying firms would have disclosed material compensation expense under the FASB proposal with a three 
(five) percent materiality threshold. The comparable percentages for the nondividend-paying firms were 35 
and 25 percent. Note that these results are quite consistent with those of our basic analysis given above. 

22 Two CFOs voluntarily noted that their firm would change compensation devices if the FASB proposal
were to be implemented. 

23 For the one-year service period, the full model value was charged to 1986 income. For a five-year service
period, we calculated the charge to 1986 income following the amortization procedure illustrated in Interpreta- 
tion No. 28, example 2, and considered 52.08 percent of the model value as the charge to 1986 income. For the 
seven-year period, we calculated the charge to 1986 income as one-seventh of the model value. We ignored the 
midyear grant date assumption for all three service periods in order to assess materiality when a full year’s 
compensation is expensed in a given year. 
Figure 3
Cumulative Percentages of the 1986 ESO Compensation Ratio
with Various Service Periods

in parentheses. For a one-year service period, 48 (35) percent of the dividend-paying
firms would have disclosed a material charge to 1986 income.²⁴ These percentages
decreased to 29 (22) percent for the five-year service period and to ten (seven) percent for
the seven-year service period. For nondividend-paying firms and a one-year service 
period, 70 (57) percent of these firms would have disclosed material charges to 1986 in- 
come. These percentages decreased to 50 (48) percent for the five-year service period 
and to 30 (23) percent for the seven-year service period. 

The overall results show that a variety of reasonable service periods shorter than 
the life of the option produce material ESO compensation for most firms in the sample. 
Over 20 percent of firms that did not pay dividends would have experienced a material 
charge to income, even if the amortization period was seven years. Additionally, a large 
percentage of the dividend-paying firms would have had material ESO compensation, 
even if options were exercisable at a rate of 25 percent per year. Note that these effects 
are conservative since our analysis assumes that each grant is an isolated occurrence. 
If, on the other hand, a firm grants ESOs each year, then the recorded compensation 
effects would accumulate. For example, the effect on income of a one-year service 
period (for a grant at the beginning of the fiscal year) is approximately equal to the in- 
come effect for a firm with, say, a three-year service period that granted approximately 
the same number of options each year. With respect to a firm that fell in the latter cate- 
gory, the total charge to ESO expense for a particular year would consist of one year’s 
amortization from two years ago, a year’s amortization from last year, and one year’s 
amortization from the current year. 
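The accumulation effect can be illustrated with a short sketch, assuming straight-line amortization and grants made at the start of each year. The function name and the grant values are hypothetical.

```python
from typing import List, Sequence


def annual_expense(grant_values: Sequence[float], service_years: int) -> List[float]:
    """Expense per year when the grant made at the start of year i is
    amortized straight-line over the following service_years years."""
    horizon = len(grant_values) + service_years - 1
    expense = [0.0] * horizon
    for start, value in enumerate(grant_values):
        for year in range(start, start + service_years):
            expense[year] += value / service_years
    return expense


# Five equal yearly grants worth $3,000 each, three-year service period
overlapping = annual_expense([3000.0] * 5, 3)
# A single $3,000 grant with a one-year service period
single = annual_expense([3000.0], 1)
```

From the third year onward the overlapping grants produce a yearly expense of $3,000, the same as the full value of one grant charged over a one-year service period.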


Grant Date Versus Vesting Date 


Given that the most recent FASB proposal states that the measurement date is the date
that the options vest, the FASB might opt to utilize a procedure similar to that currently 
required for variable-term stock options and stock appreciation rights (SARs). For 
both instruments, the measurement date occurs subsequent to the date of grant and 
compensation would be accrued from the date of grant to the vesting (measurement) 
date on the basis of the quoted market price at the end of each period. Cumulative com- 
pensation would be adjusted by a charge or credit to income each period until a final 
determination is ascertained on the vesting date. Cumulative compensation would not 
be adjusted below zero.²⁶


24 Incidentally, the exercisability feature that ranked third (following fully exercisable after one year and
exercisable in installments of 25 percent per year) was exercisable immediately on the grant date. If the ESOs are
exercisable immediately, then the full model value would be charged to expense in the year of grant. With the
assumption that operating income was about the same in 1985 and 1986, the one-year service period and
exercisable-immediately feature would have approximately the same effect on income.

25 A telephone conversation with Don Delaney of the FASB staff (who subsequently conferred with board
member Robert Swieringa) confirmed that it is likely that compensation will be accrued from the date of grant 
to the vesting date. Mr. Swieringa also noted that the method of spreading the compensation has not been 
agreed upon as yet. 

26 As an example, assume that Hypothetical Company grants 1,000 stock options to its executives in the
middle of fiscal 1985. To obtain favorable tax treatment for its executives, Hypothetical sets the exercise price 
at $30, which was equal to the market price of Hypothetical’s common stock on the date of grant. Each option 
permits an executive to purchase one share of Hypothetical’s common stock for a ten-year period; however, the 
options cannot be exercised until the expiration of a two-year waiting period. After the two-year period, all of 
the options are fully exercisable. Also, assume that the service period is two years—the period from the date of 
grant to the vesting date. On the date of grant, Hypothetical applies the FASB proposal and arrives at a total 
The vesting date is often the date ESOs are first exercisable. On this point,
responses to the CFO survey showed an average vesting period of 1.9 years from the
date of grant.²⁷ To assess the income effects under this period, we assumed (1) that all of
our firms had a vesting date that was two years after the 1985 midyear grant date, and 
(2) that the service period was the seven-year period indicated by the CFOs. We then 
gathered the necessary data to apply the FASB proposal to the 1985 grants for fiscal 
year-end 1986.²⁹ The number of options granted in 1985 and the exercise price were
appropriately adjusted using the 1989 CRSP daily master tape for stock dividends and 
splits occurring from the date of grant to the vesting date. ESO compensation ratios 
recalculated for 1986³⁰ are presented in figure 4, comparing the grant date versus the
vesting date for dividend-paying firms. If the measurement date is the date of grant, 13 
(eight) percent of these firms would have had material ESO compensation under the 
FASB proposal, with a three (five) percent materiality threshold. The vesting-date ESO 
compensation ratios were calculated with the SAR-related procedures discussed 
earlier. As shown in figure 4, greater income effects are obtained using the vesting date 
as compared to the grant date for the dividend-paying firms; approximately 21 percent 
of the dividend-paying firms would have disclosed material (at the three percent level) 





compensation of $3,000. On this date, an entry is necessary: Deferred Compensation is debited for $3,000, with 
an offsetting credit to Stock Options Outstanding. Assume that the market price of the common rises to $40 by 
the end of the fiscal year and the FASB proposal is reapplied to an option with a remaining life of 9.5 years. Also 
assume that the total compensation is now revised to $4,000. The entry on the date of grant is adjusted upward 
by $1,000. In addition, half a year’s amortization is required. The amortization entry involves a debit to Com- 
pensation Expense and a credit to Deferred Compensation. The charge of $1,000 to income for 1985 is calcu- 
lated by dividing the revised balance in the Deferred Compensation account ($4,000) by the two-year service 
period and adjusting for the fact that the options were outstanding for only half a year. At the end of 1986, the 
market price rises to $44, and another adjustment to the grant date entry is required. The FASB proposal is now 
applied to an option with a remaining life of 8.5 years. Assume that the total compensation at fiscal year-end 
1986 is $4,400. The Deferred Compensation and Stock Option accounts are adjusted upward by $400. In addi- 
tion, amortization for 1986 is required. The amortization amount for 1986 is $2,300, and is based on a cumula- 
tive calculation [($4,400 x 1.5/2)— $1,000]. At this point in time, Stock Options has a credit balance of $4,400 
and Deferred Compensation has a debit balance of $1,100. The final adjustment to the Deferred Compensation 
account for market price changes will occur on the vesting date, midyear 1987. Assume that the market price 
drops to $38 and the FASB proposal is reapplied at midyear 1987 to yield a total compensation of $3,800. Five 
hundred dollars of ESO compensation expense ($3,800 — $1,000 — $2,300) is charged for the first six months of 
1987. The adjusting entry also has a credit to Deferred Compensation for $1,100 and a debit to Stock Options for 
$600. 
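The cumulative accrual in the Hypothetical Company example can be checked with a short sketch. The function name is hypothetical; the inputs are the revalued totals and elapsed service time given in the example.

```python
from typing import List, Sequence, Tuple


def vesting_date_charges(revaluations: Sequence[Tuple[float, float]],
                         service_years: float) -> List[float]:
    """Charge to income at each revaluation under cumulative accrual.

    revaluations: (elapsed_years, total_compensation) pairs, one per
    reporting date, ending with the vesting date.
    """
    charges = []
    recognized = 0.0
    for elapsed, total in revaluations:
        fraction = min(elapsed / service_years, 1.0)
        cumulative = max(total * fraction, 0.0)  # never adjusted below zero
        charges.append(cumulative - recognized)
        recognized = cumulative
    return charges


# Year-end 1985, year-end 1986, and the midyear 1987 vesting date
charges = vesting_date_charges([(0.5, 4000.0), (1.5, 4400.0), (2.0, 3800.0)],
                               service_years=2.0)
```

The resulting charges of $1,000, $2,300, and $500 match the amounts in the example and sum to the final total compensation of $3,800.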

27 Some responses indicated graded vesting, such as 20 percent a year. These were converted to point
estimates (2.5 years in this case) for the purpose of obtaining an average vesting period.

28 A reviewer pointed out that the difference between the seven-year service period and the two-year
vesting period seems inconsistent, since one of the feasible service periods is the time span from the date of grant to
the date that the option is first exercisable. We note that the service period can be identified in the plan itself but
not disclosed in the stock-option footnote. The service period could also be indicated in a related compensation
contract when an ESO is granted to an executive. If the service period is explicitly stated in either manner, then
such a large difference (approximately five years) would be feasible. Another possibility is that the CFOs
confused the service period with the period of time that the option is outstanding.

29 To calculate 1986 ESO compensation under vesting-date accounting, we needed to calculate ESO
compensation as of fiscal year-end 1985. Half of a full year’s amortization was assumed in the computation of 1985
ESO compensation because of the midyear-grant-date assumption. We present only the results for 1986, in
which a full year’s amortization is assumed.

30 Comparison of grant-date valuation with vesting-date valuation is sensitive to market movement.
Generally low stock prices at the grant date and high stock prices at the end of fiscal year 1986, or the opposite,
could have a confounding effect on our results. To allow for this possibility, we observed the change in the S&P
500 index for the calendar years 1985, 1986, and 1987. We found that the change in the S&P 500 index was
positive for each of the three years. This observation implies that we applied both grant-date and vesting-date
accounting under similar market conditions.
Figure 4
Effect of Grant Versus Vesting Date on 1986 ESO Compensation Ratio
(Seven-Year Service Period, Two-Year Vesting Period)
Panel A. Dividend-Paying Firms (n=170):
Panel B. Nondividend-Paying Firms (n=44):
Note: The ratio is the estimate of 1986 compensation expense (calculated under the FASB proposal on each
measurement date) divided by the absolute value of 1986 operating income.

compensation expense under the FASB proposal using the vesting date (16 percent
under a five percent materiality guideline). A Mann-Whitney U-test showed no signifi- 
cant differences (p=0.33) between the cumulative distributions of the ESO compensa- 
tion ratios computed on the grant date and those computed on the vesting date. This re- 
sult suggests that grant-date accounting would be appropriate on several grounds: it is 
less costly; it is consistent with option-pricing theory; and it entails no loss in
information content.

ESO ratios for 1986 using the vesting date and the date of grant for the
nondividend-paying firms are also reported in figure 4, which shows a lower impact on income
using vesting-date than using grant-date accounting. Specifically, 29 (20) percent of 
these firms would have disclosed material compensation expense with a three (five) 
percent materiality guideline, which is compared to 39 (25) percent using the grant 
date. As with the previous case, the Mann-Whitney U-test showed no difference at the 
0.30 level. This result also supports the tentative conclusion that grant-date accounting 
is preferable to vesting-date accounting from the perspective of cost of application.³¹


VI. Concluding Comments 


Expense recognition of ESOs is likely to have a material impact on operating in- 
come irrespective of the option-valuation method employed for firms that do not pay 
dividends, whether the measurement date is the grant date or the vesting date. A mate- 
rial effect also appears likely for short service periods, whether or not the firm is divi- 
dend-paying and whether or not grant-date or vesting-date accounting is used. 

From an overall perspective, the B-S model is likely to be the model used to implement
the FASB proposal. In general, we conclude that grant-date accounting is preferable
because the grant date is conceptually the more appropriate measurement date. In addition,
from a practical perspective, grant-date accounting is likely to be less costly to imple- 
ment because the SAR-related procedure need not be utilized and because application 
of the fair-value model might be disregarded. 

We offer this paper as an example for conducting empirical research on accounting 
issues of interest to the FASB prior to concluding the standard setting process. 


31 Recall that with grant-date accounting, the B-S model would be used in the application of the FASB
proposal by approximately 64 percent of the sampled firms. Our results indicated that, when the FASB proposal
was applied at the end of fiscal 1986 with vesting-date accounting, the minimum-value estimates were much 
lower than similar minimum-value estimates generated on the grant date. Specifically, under vesting-date 
accounting, 42 of the 170 firms that distributed cash dividends showed minimum-value estimates that were 
higher than the B-S estimates. From an overall perspective, under vesting-date accounting, B-S estimates 
would have been used in approximately 80 percent of the cases. This effect occurred primarily because of stock 
price declines between the date of grant and the end of fiscal 1986. We calculated the ratio of the FASB’s 
minimum-value estimate to the B-S estimate for the 214 sampled firms on the grant date as well as the vesting 
date. The mean ratio for the grant date was 0.9638, with a standard deviation of 0.1952. The average ratio for 
the vesting-date calculations was 0.7283, with a standard deviation of 0.3260. Twenty of the 214 firms had min- 
imum-value estimates of zero for the 1986 vesting-date calculations. Two firms had minimum-value estimates 
of zero for the grant-date calculations. We also conducted a Mann-Whitney U-test for the differences in the 
ratio of the FASB’s minimum-value estimate to the B-S estimate applied on the grant date versus the vesting 
date. We rejected the null hypothesis of no difference at the 0.001 level. Thus, for the sampled firms, B-S 
estimates are much more likely to be utilized in the FASB proposal when the measurement date is the vesting
date versus the grant date (with a seven-year service period and stock prices that have been falling).
Additionally, we conducted another Mann-Whitney U-test for both the dividend-paying and the
nondividend-paying subgroups. The null hypothesis was that the cumulative distribution functions for the ESO
compensation estimates from the FASB’s minimum-value procedure and the B-S model were the same. The null
hypothesis was rejected for both the dividend-paying and the nondividend-paying subgroups (the probability 
levels were 0.0581 and 0.0087, respectively). Thus, if vesting-date accounting is used, it seems less likely that 
the fair-value model could be disregarded in calculating ESO compensation. 
In the remainder of this article, section I describes the potential determinants of the
stock, bond, and firm value effects of SFAS 94 and concludes with statements of
testable hypotheses. Section II outlines the research design and the data sources together
with the basic test strategy used. Section III contains empirical results, and section IV
provides a brief summary.


I. Potential Effects of SFAS 94 
No-Effects Hypothesis 


Under the no-effects hypothesis advanced in the accounting literature, a mandated 
accounting change does not change the affected firm’s market value. This irrelevance 
proposition, based on perfect and frictionless markets, implies that no security price 
changes are associated with the deliberation or regulatory announcements pertaining 
to a mandated accounting change. 

When applied to firms affected by SFAS 94, this hypothesis gives rise to two asser- 
tions. First, consolidated reports are not expected to alter the market’s assessment of a 
parent’s operations, risk, or ability to raise funds because market participants had all 
along viewed the parent and subsidiary operations as a single entity. For example, 
Comiskey et al. (1987) find that the market incorporates the unconsolidated subsidiary’s 
debt in its assessment of the parent’s risk. Second, forced consolidation under SFAS 94 
does not preclude the formation or use of nonhomogeneous subsidiaries because such a 
decision is linked to the firm’s business decisions. For example, Ronen and Sondhi 
(1989) argue that the rationale for creating finance subsidiaries is not related to ac- 
counting consolidation rules. Thus, in an efficient market, ceteris paribus, neither the 
value of the common stock nor the value of the debt will change as a result of SFAS 94. 


Redistribution-Effects Hypothesis 


An alternative explanation implies that the unanticipated announcement of a man- 
dated accounting change that alters financial arrangements results in a wealth transfer 
between the affected firm’s stockholders and debtholders by changing the risk of debt. 
Adoption of SFAS 94 is not expected to affect the financial arrangements of firms that
were already consolidating (hereafter cs firms); therefore, no change in the equity value
or the debt value of these firms is predicted. However, the financial arrangements of firms
reporting consolidation due to SFAS 94 (hereafter c̄s firms) may be affected because the
standard limits their ability to issue additional debt.¹ As an example, Tenneco illustrated
the impact of SFAS 94 on financial arrangements in its comment letter to the FASB:


Consolidation of finance subsidiaries would require amendments in order for Tenneco 
to borrow long term debt.... These indentures protect the financial interests of thou- 
sands of holders of more than 40 different issues of Tenneco public notes and inden- 
tures. In addition, they are critical to maintaining revolving lines of credit with 46 U.S. 
banks and 35 foreign banks, a master note borrowing facility with four banks, and com- 
mercial paper issuance.... There is a possibility that creditors could demand higher 
rates of interest as the price to amend documents. If this were to occur and the increase 
was only 25 basis points, annual costs to Tenneco would exceed $25 million.


1 Livnat and Sondhi (1986) find that the debt covenants of 54 percent of their sample firms with unconsolidated
finance subsidiaries had restrictions on additional debt in the form of balance-sheet data. Similarly, Mian
and Smith (1990) report that covenants for firms with unconsolidated subsidiaries employ unconsolidated
balance-sheet data.
To the extent that SFAS 94 restricts a company’s ability to issue additional debt, it
can be shown that in some situations returns to the existing debtholders will be higher
than their pre-SFAS 94 returns, while under other conditions their post-SFAS 94
returns will never be lower than their pre-SFAS 94 returns (see appendix A). Specifically,
when the pre-SFAS 94 distribution of returns available to existing debtholders
is stochastically dominated by the post-SFAS 94 distribution, the value of debt will increase.
With no change in firm value, the increase in the value of debt implies a decline
in the value of common stock. Thus, under the redistribution-effects hypothesis, the
equity value will decline while the debt value will increase for c̄s firms.


Cash Flow Effects Hypothesis 


That mandated accounting changes can reduce a firm’s expected cash flows is an 
explanation based on higher transaction costs, increased agency costs, and a reduced 
opportunity set of accounting, investment, financing, and production decisions. Under 
this hypothesis, the market value of common stock declines, but the market value of 
debt does not increase.²

In the context of SFAS 94, the reduction in a firm’s expected cash flows due to the
three factors mentioned above has been a subject of continued discussion and debate.
For both cs and c̄s firms, SFAS 94 eliminates an accounting alternative, thereby reducing
the opportunity set of accounting decisions. The c̄s firms that lobbied against
SFAS 94 have identified some of the adverse operating implications of the proposed
change. For example, Ford Motor Company argued that the implementation of SFAS
94 would require modifications of incentive or compensation plans to retain the original
objectives of the program, which results in legal and administrative costs to rewrite,
renegotiate, and approve these plans. Similarly, firms in the construction industry
noted that the impact of SFAS 94 on balance-sheet ratios might preclude fiduciary bond
companies from providing bid and performance bonds on future contracts.

The financial press (e.g., American Banker, August 10, 1987) speculated that forced
consolidation may cause parents with previously unconsolidated finance subsidiaries
to incur financing costs in selling the loans held by their finance units. Sprouse (1988)
reports that the reorganization plan undertaken by Tenneco Inc. is in response to SFAS
94. It has been suggested that Tenneco undertook this action, in part, because the
reorganization costs were estimated to be lower than other transaction costs such as
floatation costs for recapitalizing old debt. If transaction costs arise from adoption of SFAS
94, the cash-flow hypothesis implies that there will be a decline in common stock
values, and a nonpositive effect on debt values for cs firms.

Based on the previous discussion, three testable hypotheses that focus on the pattern
of common stock and debt responses to the issuance of SFAS 94 can be formally
stated as follows:³


H₁: Ceteris paribus, there is, on average, a nonzero effect on the wealth of stockholders
of cs and c̄s firms due to the issuance of SFAS 94.


2 Handjinicolaou and Kalay (1984) demonstrate graphically the changes in the market values of equity and 
debt that are associated with reduction in the market value of the firm. 

3 Khurana (1989) examined ten events pertaining to SFAS 94 deliberations and policy announcements dur- 
ing 1985-1987 and found evidence of significant changes in expectations for the sample firms only around the 
issuance of SFAS 94. Therefore, in judging the appropriateness of the significance levels used, the reader may 
consider the fact that nine of the ten events examined were insignificant. 
H₂: Ceteris paribus, there is, on average, a nonzero effect on the wealth of debtholders
of cs firms due to the issuance of SFAS 94.

H₃: Ceteris paribus, there is, on average, a positive effect on the wealth of debtholders
of c̄s firms due to the issuance of SFAS 94.


II. Research Methods


Sample Selection and Data Sources 


Sample firms were selected by using the following procedure:


1. The initial sample consisted of firms listed in American Bankers’ annual survey
of the finance company industry dated 13 June 1986.⁴ The survey ranked finance
companies by total capital funds (stockholders’ equity plus noncurrent subordinated
debt) as of 31 December 1985 or the nearest fiscal year end. A total of 136
finance companies are listed as captive, affiliate, or independent firms. A captive
primarily finances the sale of the parent’s products or services; an affiliate is
a subsidiary of another company but not a captive; and an independent firm is
neither a captive nor an affiliate. The survey listed 36 captives, but did not
segregate the remaining 100 companies into affiliates or independent firms.
The Directory of Corporate Affiliations (1986) was used to identify the ownership
status of firms listed in the affiliate or independent category. Seventeen firms
were not classifiable with this source, but nine of these were classifiable following
direct contact. Of the 100 firms in the affiliate or independent category, 74
were classified as affiliates, 18 as independent, and eight were unclassifiable.
This directory was also used to identify the ultimate parent of the captive and
affiliate companies. Six parents owned at least two captive or affiliate companies.

2. The National Automated Accounting Retrieval Service (NAARS) and Moody’s
Industrial, Public Utilities, Bank and Finance or Transportation (1988) manuals
were checked to ensure that the captive or the affiliate was majority-owned by
the parent identified in the American Bankers’ Survey as of 31 December 1985 or
the nearest fiscal year.

3. Parents were excluded if they were not on the Center for Research in Security
Prices (CRSP) daily returns tapes during the test period covering 717 trading
days ending with the issuance of SFAS 94. This elimination resulted in 72 parent
companies that constitute the “stock” sample.


An examination of NAARS and the Moody’s manuals indicated that 36 of the 72 
stock sample firms had consolidated their finance subsidiaries as of 31 December 1985 


4 Although SFAS 94 affects the consolidation policy of parents with majority-owned finance, insurance,
real estate, and leasing subsidiaries, the initial sample for this study focused only on the finance company
industry, for two reasons. First, Mohr (1988) and Heian and Theis (1989) have documented that the magnitude
of the effects of SFAS 94 on accounting numbers is likely to be material for parents with unconsolidated
finance subsidiaries. Second, Mian and Smith (1990) find that Fortune 500 companies with majority-owned
finance subsidiaries also own other majority-owned insurance, leasing, and real estate subsidiaries. Similar
analysis for the stock sample used in this study indicates that 50 of the 72 parent companies held majority
ownership not only in finance subsidiaries but also in insurance, leasing, or real estate subsidiaries. Thus, by
focusing on the finance company industry, it was possible to identify parent companies that were likely to have
more pronounced, and, a priori, more readily observable, economic consequences arising from SFAS 94.
or the nearest fiscal year, and that the remaining 36 firms used the equity method to
account for such subsidiaries. Of the 72 stock sample firms, 33 are in the financial services
industry, which showed some evidence of industry commonality in the choice of
consolidation policy. The financial services industry consolidates its finance subsidiaries
more (30 consolidated vs. three unconsolidated) than other industries (machinery,
petroleum and rubber, transport and communication, and transportation equipment),
where finance subsidiaries were generally unconsolidated (23 unconsolidated vs. one
consolidated). This difference is consistent with Mian and Smith (1990).

To test the hypotheses with respect to debt, publicly traded nonconvertible debt
issues of firms in the stock sample were identified from the Interactive Data Corporation
(IDC) data base. This resulted in a sample of 22 debt securities of 14 cs firms and 75
debt securities of 27 c̄s firms. Debt closing prices from 14 September 1987 to 12 December
1987 were retrieved from the IDC data base. The amounts and dates of interest payments
were obtained from Standard and Poor’s Bond Guide (1987). The Wall Street Journal
Index was used to obtain daily observations of the Dow Jones Bond Index, which
comprises ten industrial and ten utility bonds.


Research Design 

This study uses an interrupted time-series design to study the effects of SFAS 94 on 
the wealth of security holders. The announcement period for the issuance of SFAS 94 
covers a two-day trading period encompassing The Wall Street Journal publication day 
(2 November 1987) and the preceding trading day (30 October 1987). Since this an- 
nouncement period is close to the 19 October market crash, market model misspecifica- 
tion around the issuance of SFAS 94 is examined for equally weighted portfolios con- 
taining randomly selected equity securities. 


Change in Stockholders’ Wealth 


The change in stockholders’ wealth due to the issuance of SFAS 94 is operationalized
as the unexpected excess stock return during the announcement period for equally
weighted portfolios of cs and c̄s firms. This unexpected excess stock return, $Z_j$, for each
portfolio is estimated by using equation (1) over 717 trading days ending 2 November
1987.


$$\tilde{r}_{jt} = a_{0j} + b_{0j}\,\tilde{r}_{mt} + Z_j D_t + e_{jt} \qquad (1)$$

where:

$\tilde{r}_{jt}$ = rate of return on an equally weighted jth portfolio of common stock on day t,
$\tilde{r}_{mt}$ = rate of return on the CRSP equally weighted index on day t,
$D_t$ = 1 for the announcement period of SFAS 94, 0 otherwise,
$e_{jt}$ = residuals for the jth common stock portfolio on day t,
$a_{0j}$ = estimated average intercept,
$b_{0j}$ = estimated average beta for contemporaneous market variable,
$Z_j$ = estimated unexpected excess stock return for the jth portfolio at the issuance
of SFAS 94, and
t = 1, 2, ..., 717 days.
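
Equation (1) is an ordinary least squares regression of portfolio returns on a constant, the market return, and an announcement-period dummy. The following is a minimal sketch in Python with NumPy, not the software used in the study; the function name `event_dummy_regression` and the synthetic return series are illustrative assumptions.

```python
import numpy as np

def event_dummy_regression(r_port, r_mkt, event_dummy):
    """OLS of r_port on [1, r_mkt, event_dummy].

    Returns (coef, t_stats), each ordered as [a0, b0, Z].
    """
    y = np.asarray(r_port, dtype=float)
    X = np.column_stack([
        np.ones_like(y),
        np.asarray(r_mkt, dtype=float),
        np.asarray(event_dummy, dtype=float),
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]            # 717 - 3 = 714 in the setting above
    sigma2 = resid @ resid / dof         # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, t_stats

# Synthetic illustration only (hypothetical numbers, not the study's data).
rng = np.random.default_rng(0)
T = 717
r_mkt = rng.normal(0.0005, 0.01, T)
D = np.zeros(T)
D[-2:] = 1.0                             # two-day announcement period at the end
r_port = 0.0001 + 1.05 * r_mkt - 0.007 * D + rng.normal(0.0, 0.001, T)

coef, t_stats = event_dummy_regression(r_port, r_mkt, D)
print("a0, b0, Z:", coef)
print("t-statistics:", t_stats)
```

The third coefficient plays the role of Z: the average daily return over the two announcement days that the market factor does not explain.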
Change in Debtholders’ Wealth


The change in debtholders’ wealth due to the issuance of SFAS 94 is operationalized
as the cumulative market-adjusted return during the announcement period for
equally weighted portfolios of debt securities of cs and c̄s firms. The market-adjusted
return, $MAR_{it}$, for each debt security i on day t is calculated as:

$$MAR_{it} = R_{it} - R_{mt} \qquad (2)$$

where:

$R_{it}$ = return (including accrued interest) of debt security i on day t, and
$R_{mt}$ = return on the Dow Jones Bond Index on day t.


If debt security i is not traded on day t, then $R_{it}$ is computed with the jump returns
model (Eger 1983).⁵ Returns of multiple debt securities issued by the same firm are averaged
and treated as a single debt issue with a single return.

The market-adjusted returns, $MAR_{it}$, are averaged across the $N_t$ firms that have at
least one debt security with a trade on day t to form an average daily market-adjusted
return, $\overline{MAR}_t$. These cross-sectional average daily market-adjusted returns are
summed for days -1 and 0 to yield a cumulative market-adjusted return for the SFAS 94
announcement period:

$$CMAR = \sum_{t=-1}^{0} \overline{MAR}_t \qquad (3)$$
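
The averaging and cumulation in equations (2) and (3) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: each row is assumed to hold one firm's return (already averaged across that firm's securities), a missing trade is coded as NaN instead of being handled by the jump returns model, and the function name and numbers are hypothetical.

```python
import numpy as np

def cumulative_market_adjusted_return(firm_returns, index_returns):
    """Return (average daily MAR, CMAR) for a firms-by-days return matrix.

    firm_returns:  shape (n_firms, n_days), NaN where the firm's debt did not trade.
    index_returns: shape (n_days,), bond index return for each day.
    """
    R = np.asarray(firm_returns, dtype=float)
    Rm = np.asarray(index_returns, dtype=float)
    mar = R - Rm                               # equation (2); NaN stays NaN
    daily_avg = np.nanmean(mar, axis=0)        # average over firms that traded
    return daily_avg, daily_avg.sum()          # equation (3)

# Hypothetical returns in percent for three firms on days -1 and 0.
firm_returns = [[0.2, np.nan],
                [-0.4, 0.1],
                [np.nan, -0.3]]
index_returns = [0.1, 0.5]

daily_avg, cmar = cumulative_market_adjusted_return(firm_returns, index_returns)
print(daily_avg, cmar)
```

Because the set of firms that trade differs from day to day, the number of firms entering each daily average differs as well, which is why the tables report N separately for each event day.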
In accordance with Dennis and McConnell (1986), the statistical significance of the
average daily market-adjusted return on day t is determined by using a t-statistic computed as:

$$t = \overline{MAR}_t \,/\, \left( \sigma_t / \sqrt{N_t} \right) \qquad (4)$$

where:

$$\sigma_t = \left[ \frac{1}{N_t - 1} \sum_{i=1}^{N_t} \left( MAR_{it} - \overline{MAR}_t \right)^2 \right]^{1/2} \qquad (5)$$

is the cross-sectional standard deviation of market-adjusted returns on day t. To analyze
the two-day announcement period return, the day -1 and day 0 market-adjusted
returns are summed for each security and a t-statistic is calculated according to
equations (4) and (5). This statistic is distributed according to the t-distribution with
$N_t - 1$ degrees of freedom.
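
A cross-sectional t-statistic of this kind can be computed as sketched below. The sketch assumes the standard deviation is the usual sample estimate with an N - 1 divisor, which matches the stated degrees of freedom; the function name and the sample values are hypothetical.

```python
import numpy as np

def cross_sectional_t(mar_values):
    """t-statistic and degrees of freedom for the mean market-adjusted return."""
    x = np.asarray(mar_values, dtype=float)
    x = x[~np.isnan(x)]                  # keep only firms with a trade
    n = x.size
    mean = x.mean()
    sd = x.std(ddof=1)                   # sample standard deviation, N - 1 divisor
    return mean / (sd / np.sqrt(n)), n - 1

# Hypothetical market-adjusted returns (percent) for four firms on one day.
t_stat, dof = cross_sectional_t([-1.0, 0.5, -2.0, 0.5])
print(t_stat, dof)
```

With only a handful of firms trading on a given day, the standard error is large, so even sizeable average returns can fail to reach conventional significance levels.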


III. Empirical Results 


Stockholders’ Wealth Effects 


Results of the test of stockholders’ wealth are reported in panel A of table 1. Parameter
estimates of Z indicate the average abnormal returns estimated from equally
weighted portfolios of cs and c̄s firms separately. The average abnormal return for the


5 This model assumes that $R_{it}$ is zero if no trade occurs on day t. If a trade does occur on day t after a gap of n
calendar days, then the return associated with the price change on day t is assigned to day t. In addition, $R_{mt}$ is
also compounded to reflect the same trading interval as $R_{it}$.
Table 1
Test of Nonzero Effect of SFAS 94 on Stockholders’ Wealth (Hypothesis 1)

Panel A. Estimated Model ($\tilde{r}_{jt} = a_{0j} + b_{0j}\,\tilde{r}_{mt} + Z_j D_t + e_{jt}$):

                     cs Firms                    c̄s Firms
Parameter      Estimate   t-Statistic*     Estimate   t-Statistic*
$a_0$            0.0001       0.28         -0.0001      -0.12
$b_0$            1.0520      69.53***       1.1231      56.25***
$Z$             -0.0071      -2.46**       -0.0157      -4.07***

Panel B. Model Specification Statistics:

                                                             cs Firms      c̄s Firms
F (2, 714 df)                                               2500.20***    1600.02***
Adjusted R²                                                    0.88          0.82
Partial R² between $\tilde{r}_j$ and $D_t$ given $\tilde{r}_m$   0.01          0.02
White’s model specification chi-square test (3 df)             2.01          3.64

All terms are defined in the text.
* Degrees of freedom are 714. Significance levels are for a two-tailed test.
** Significant at the five percent level.
*** Significant at the one percent level.
cs = Firms that consolidated all majority-owned subsidiaries prior to SFAS 94.
c̄s = Firms that did not consolidate all majority-owned subsidiaries prior to SFAS 94.


cs portfolio around the issuance of SFAS 94 is -0.71 percent, which is significant at
the 0.05 level. In contrast, the average abnormal return for the c̄s portfolio is -1.57
percent, which is significant at the 0.01 level. The Chow (1960) F-test was used to evaluate
differences in the parameter estimates of Z for the two portfolios. The results indicate
that the average abnormal return for the c̄s firms is significantly lower (p-value
= 0.08) than that of cs firms, which suggests that the abnormal returns vary across firms
depending on the consolidation policy used prior to the issuance of SFAS 94.

Panel B of table 1 reports specification statistics of the estimated models, which are
significant at the 0.01 level. The adjusted R² values are high because the market factor is
included in the estimation, but the coefficient of partial determination indicates that
the dummy variable, $D_t$, explains about two percent of the variation in the portfolio
returns of c̄s firms. In addition, White’s (1980) chi-square test indicates that the assumption
of constant residual variance is not violated.

Additional tests were performed to examine the sensitivity of the portfolio results to
industry concentration, outliers, and the closeness of the SFAS 94 announcement period
to the 19 October 1987 market crash. The analysis was first repeated on two subsamples
of the cs portfolio and a subsample of the c̄s portfolio.⁶ In the cs portfolio subsample
that excludes all financial service firms (N=6), the parameter estimate of Z is
negative (-0.0039) but insignificant at the 0.10 level. In the second cs portfolio subsample
that excludes banks only (N=21), the parameter estimate of Z is significantly negative
(-0.0118) at the 0.01 level. Because each of these two subsamples is smaller than
the entire sample, shifts in the parameter estimates and significance levels are not
surprising. In general, these results suggest that the observed average stock market reaction
around the issuance of SFAS 94 for the cs firms is not attributable solely to banks.
By comparison, the Z estimate of -0.0158 for the subsample of c̄s firms that excludes all
financial service firms (N=33) is significant at the 0.01 level. These values are essentially
the same as those for all c̄s firms.

6 Thirty of the 36 cs firms are in the financial services industry, and 21 of them are banks. Only three of the
c̄s firms are financial service firms and none is a bank.

Nonparametric tests were conducted to check if outliers were driving the portfolio
results. The estimated values of individual firm Z parameters are negative for 67 percent
of the cs firms and 75 percent of the c̄s firms. The Wilcoxon signed-rank tests
indicate significant (p-value = 0.05) negative returns for the cs and c̄s firms around the
issuance of SFAS 94.

The impact of market irregularities following the 19 October market crash was
examined by forming 100 equally weighted (control) portfolios, each containing 72
securities. These portfolios were formed by selecting securities at random and with
replacement from a population consisting of all securities for which daily returns were
available on CRSP tapes for 717 trading days ending 2 November 1987. To replicate the
earlier results, the returns on each of these portfolios were regressed on the CRSP
equally weighted index and a dummy variable for the SFAS 94 announcement period.
The mean and standard deviation of the t-statistics for the parameter estimates of Z are
0.044 and 1.845, respectively. The mean statistic is small relative to the standard deviation
of these estimates, and the proportion of positive t-statistics is 54 percent. Thus the
t-tests for the parameter estimates of Z for the random (control) portfolios do not appear
to be confounded by market irregularities following the 19 October market crash, thereby
suggesting that there is no reason to suspect that the results for the cs and c̄s firms
are driven by the crash.
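
The control-portfolio check can be sketched as a simple resampling loop. The code below is an illustration under stated assumptions, not the procedure actually run on the CRSP tapes: the security returns are simulated, and the function name and its default arguments are hypothetical.

```python
import numpy as np

def control_portfolio_t_stats(returns, r_mkt, event_dummy,
                              n_portfolios=100, size=72, seed=0):
    """t-statistics of the event-dummy coefficient for random control portfolios.

    returns: array of shape (n_securities, n_days) of daily security returns.
    """
    rng = np.random.default_rng(seed)
    returns = np.asarray(returns, dtype=float)
    X = np.column_stack([np.ones(len(r_mkt)), r_mkt, event_dummy]).astype(float)
    xtx_inv = np.linalg.inv(X.T @ X)
    t_stats = np.empty(n_portfolios)
    for k in range(n_portfolios):
        # Draw securities at random, with replacement.
        picks = rng.integers(0, returns.shape[0], size=size)
        y = returns[picks].mean(axis=0)          # equally weighted portfolio
        coef = xtx_inv @ X.T @ y                 # OLS estimates [a0, b0, Z]
        resid = y - X @ coef
        s2 = resid @ resid / (len(y) - X.shape[1])
        t_stats[k] = coef[2] / np.sqrt(s2 * xtx_inv[2, 2])
    return t_stats

# Simulated population with no announcement effect (hypothetical data).
rng = np.random.default_rng(1)
n_sec, T = 300, 717
r_mkt = rng.normal(0.0, 0.01, T)
returns = r_mkt + rng.normal(0.0, 0.02, (n_sec, T))
D = np.zeros(T)
D[-2:] = 1.0

t = control_portfolio_t_stats(returns, r_mkt, D)
print(t.mean(), t.std(ddof=1), (t > 0).mean())
```

The three printed summaries correspond to the quantities reported above: the mean and standard deviation of the t-statistics and the proportion that are positive.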

In summary, the results do not support the no-effects hypothesis predicting no effect
on the wealth of the stockholders of cs and c̄s firms around the issuance of SFAS
94. However, the significant negative excess returns for both the cs and c̄s portfolios
around the issuance of SFAS 94 are consistent with the cash-flow effects hypothesis
and suggest that the market anticipated a reduction in the firm value of cs and c̄s firms,
possibly because of transaction costs.


Debtholders’ Wealth Effects 


Results of tests examining whether there is, on average, an effect on the wealth of
the debtholders of cs and c̄s firms around the issuance of SFAS 94 are reported in table
2. Results for the debt of cs firms are presented in panel A. The day -1, day 0, and two-day
market-adjusted returns are -0.75 percent, -0.76 percent, and -1.51 percent,
respectively, with corresponding t-statistics of -0.19, -0.34, and -0.18, which do not
permit rejection of the null hypothesis of zero abnormal returns for the cs firms’ debt.
Panel B presents results for the debt of c̄s firms. The market-adjusted returns on day -1
and day 0 are -0.05 percent (t = -0.03) and -0.95 percent (t = -0.20), respectively.
The two-day cumulative market-adjusted return of -1 percent has a t-statistic of
-0.10. None of the statistics reported allows rejection of the null hypothesis of
nonpositive abnormal returns accruing to the debtholders of the c̄s firms at the 0.10 level.

The average market-adjusted returns reported in table 2 for day —1 and day 0 may 
Table 2
Tests of the Effect of the Issuance of SFAS 94 on the
Wealth of Debtholders (Hypotheses 2 and 3)

                   Average             Average         Average            Percentage of N
                   Raw                 Market Index    Market-Adjusted    with Negative Market-
Event Day t        Returns (%)    N    Returns (%)     Returns (%)        Adjusted Returns (%)

Panel A. cs Firm Debt:
-2                   -0.13         5      0.06            -0.19                 80
-1                   -0.69         3      0.08            -0.75                 67
 0                   -0.27         4      0.49            -0.76                 75
 1                    0.68         5      0.51             0.17                 60
 2                    1.06         4      1.32            -0.26                 75
Around day 0*         1.28         9      1.48            -0.20                 55

Panel B. c̄s Firm Debt:
-2                    0.93        11      0.29             0.64                 45
-1                    0.19        10      0.24            -0.05                 20
 0                    0.38        11      1.33            -0.95                 72
 1                    0.44        10      0.23             0.21                 30
 2                    0.18        10      0.89            [illegible]           50
Around day 0*         0.57        22      2.33            -1.76                 77

* This period is defined as the interval from the last trading day prior to day -1 through the first trading
day on or after day 0.
N = Number of firms with debt securities that traded on event day.
cs = Firms that consolidated all majority-owned subsidiaries prior to SFAS 94.
c̄s = Firms that did not consolidate all majority-owned subsidiaries prior to SFAS 94.


represent an incomplete measure of the debt response to the issuance of SFAS 94 because
not all debt securities trade on these days. To allow for this possibility, the initial
postannouncement market-adjusted return (in percent) for each security was calculated.
The initial postannouncement period was defined as the last trading day prior
to day -1 through the first trading day on or after day 0 for which the debt security
prices were available. While the portfolio of 21 debt securities (representing nine cs
firms) experienced an initial postannouncement market-adjusted return of -0.20
percent, the portfolio of 70 debt securities (representing 22 c̄s firms) experienced a return
of -1.76 percent. The mean initial postannouncement market-adjusted returns are
negative but statistically insignificant. Furthermore, 55 percent of the cs firm debt securities
showed negative abnormal returns, while 77 percent of the c̄s firm debt securities
showed negative abnormal returns.

In summary, the results indicate that statistically significant wealth increments are 
not realized by debtholders. The combined results of stock and debt analysis are incon- 
sistent with the redistribution effect and suggest that the cash-flow effects of SFAS 94 
dominate. 
IV. Summary


Results indicate that the issuance of SFAS 94 is associated with significant negative 
excess stock returns and insignificant negative returns for the debt of cs and c̄s firms.
One possibility is that the market was unable to infer the FASB position on the consoli- 
dation issue until the actual issuance of SFAS 94. In terms of Leftwich’s (1981) model of 
the valuation effects of a mandated accounting change, the excess returns can be 
interpreted as reflecting investors’ revised expectations regarding cash flows conse- 
quent to SFAS 94. In addition, the negative excess stock returns observed at the issu- 
ance of SFAS 94 vary across firms depending on the consolidation policy used prior to 
the issuance of SFAS 94. This does not imply that the consolidation policy itself is the 
cause of the differential performance since the choice of pre-SFAS 94 consolidation 
policy might proxy for other business attributes. Overall, the stock results suggest cash- 
flow effects as a plausible explanation for the negative wealth effect associated with the 
issuance of SFAS 94. 


Appendix A 


The purpose of this appendix is to demonstrate the possible redistribution effects of SFAS 94 on the
wealth of debtholders. The key components of the model are as follows.

1. There is only one period in the model, and it begins at time t=0 when risky debt is issued with a promise
to provide gross dollar returns, $Y$. Owners of this debt are referred to as the existing debtholders.
The bond covenants allow the firm to promise $Y_n$ as gross dollar returns to the new debtholders. The
return distribution, $\tilde{Y}_0$, owned by the existing debtholders at time t=0 when risky debt is issued is as
follows:

$$\tilde{Y}_0 = \begin{cases} Y & \text{if } \tilde{X} \ge Y + Y_n, \\ F_0 \tilde{X} & \text{if } \tilde{X} < Y + Y_n, \end{cases}$$

where $0 < F_0 < 1$, $F_0 = Y/(Y + Y_n)$, and $\tilde{X}$ = firm earnings.

2. At time t=1, the passage of SFAS 94 affects the financial arrangements between securityholders. That
is, the firm can now promise only $Y_n'$ ($< Y_n$) as gross dollar returns to new debtholders. The return
distribution, $\tilde{Y}_1$, owned by the existing debtholders at time t=1 when SFAS 94 is issued is as follows:

$$\tilde{Y}_1 = \begin{cases} Y & \text{if } \tilde{X} \ge Y + Y_n', \\ F_1 \tilde{X} & \text{if } \tilde{X} < Y + Y_n', \end{cases}$$

where $0 < F_1 < 1$, $F_0 < F_1$, and $F_1 = Y/(Y + Y_n')$.

3. All securities are valued only on the basis of the cash earnings distribution available to the owners.

4. The firm’s earnings (i.e., gross dollar returns before interest), $\tilde{X}$, and market value remain unchanged as a
result of SFAS 94.

5. Thus the differences in the return distributions owned by the firm’s existing debtholders at time t=0
and time t=1 will be:

$$\tilde{Y}_1 - \tilde{Y}_0 = \begin{cases} 0 & \text{if } \tilde{X} \ge Y + Y_n, \\ Y - F_0 \tilde{X} & \text{if } Y + Y_n' \le \tilde{X} < Y + Y_n, \\ (F_1 - F_0)\,\tilde{X} & \text{if } \tilde{X} < Y + Y_n'. \end{cases}$$

It follows that existing debtholders’ returns at time t=1 can never be less than the returns at time t=0
and that they will be more whenever $\tilde{X} < Y + Y_n'$. That is, $\tilde{Y}_0$ will be stochastically dominated by $\tilde{Y}_1$ and the
market value of the bonds will rise.
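
A small numerical check of this argument is sketched below. It assumes, following the definition of the sharing fractions in the appendix, that when earnings fall short of total promised payments the existing debtholders receive a pro rata share; the function name and the dollar amounts are hypothetical.

```python
def existing_debt_payoff(x, y, y_new):
    """Payoff to existing debtholders for earnings x.

    y:     gross dollar return promised to existing debtholders.
    y_new: gross dollar return the firm may promise to new debtholders.
    """
    total = y + y_new
    if x >= total:
        return y                  # promise met in full
    return x * y / total          # pro rata share of earnings

# Hypothetical amounts: SFAS 94 lowers the allowable new-debt promise from 100 to 50.
Y, Y_N_PRE, Y_N_POST = 100.0, 100.0, 50.0

for earnings in (120.0, 175.0, 250.0):
    pre = existing_debt_payoff(earnings, Y, Y_N_PRE)
    post = existing_debt_payoff(earnings, Y, Y_N_POST)
    print(earnings, pre, post)
```

In this illustration the post-SFAS 94 payoff is at least as large as the pre-SFAS 94 payoff at every earnings level, and strictly larger when earnings are positive but fall short of the original total promise of 200.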


SYNOPSIS: Professional standards for analytical procedures call for auditors
to hypothesize likely causes of unexpected patterns in financial statement
balances and to develop plans to investigate (SAS 56). Therefore,
audit efficiency and effectiveness depend on competency in recognizing
patterns in financial data and in hypothesizing likely causes of those patterns
to serve as a guide for further testing. Yet our knowledge of how well
auditors accomplish these crucial steps has been limited by the difficulty of
developing research tasks with criteria for evaluating auditor performance.
The purpose of this study is to relate processes of pattern recognition and
hypothesis generation to quality of performance in an analytical task.

To accomplish this purpose, we conducted a laboratory study in which
21 auditors were asked to think aloud while performing an analytical
procedures task. The case contained a seeded error that caused a fairly
complex pattern of discrepancies between projected and unaudited financial
ratios and balances. Think-aloud verbal protocols provided a trace of
subjects’ reasoning processes, and the seeded error provided an outcome
criterion to evaluate the hypotheses they generated. This design supplies
evidence needed to evaluate pattern recognition and hypothesis generation
processes that lead to correct and incorrect outcomes.

Results indicated that three auditors made acquisition errors and four 
failed to combine crucial cues into a pattern. Of the 14 auditors who 
recognized the pattern, six proposed a hypothesis consistent with the pat- 
tern. Thus, hypothesis generation was the stage at which process errors 
most frequently occurred, and the least experienced auditors had the most 
difficulty at that stage. Specific process problems that inhibited generation 
of a correct hypothesis included: addressing only part of a recognized 
pattern, and/or fixating on a certain error type. Thus, protocol analysis 
allowed insight into which stages in the process proved most difficult for
the auditors and why, providing a focus for further research and decision 
aid development. 

This research responds to the call for inclusion of criterion outcomes in 
audit judgment research (e.g., Libby 1981; Ashton 1982; Biggs 1985; Davis 
and Solomon 1989). The importance of criterion outcomes can be seen in 
the advances made in decision research in medicine (e.g., Johnson et al. 
1982; Patel and Groen 1986; Lesgold et al. 1988) and physics (e.g., Chi et
al. 1982; Robertson 1990). In medicine especially, the ability to relate diag- 
nostic processes to outcome quality has contributed to the theory and 
practice of medical decision making (Kassirer 1989) in ways that have been 
unavailable in auditing. 


Key Words: Analytical procedures, Pattern recognition, Hypothesis gener- 
ation, Verbal protocol analysis, Auditor judgment. 


Data Availability: The authors are conducting further analysis of the protocol
data. Requests for the data should be accompanied by a description of
intended uses.


THIS study analyzes how decision processes of auditors are associated with quality
of performance in analytical procedures. The paper is organized into four sections:
(1) a review of background literature and development of the research questions,
(2) a description of the task and research methods, (3) results, and (4) limitations
and conclusions.


I. Background 


The recognition of patterns and generation of hypothesized causal explanations
discussed in professional standards on analytical procedures (SAS 56) are also
emphasized in the Einhorn and Hogarth (1982, 1987) theory of decision making.¹ For example,
they state that searching for causes of events “involves looking for patterns, making
links between seemingly unconnected events, (and) testing possible chains of causation
to explain an event” (Einhorn and Hogarth 1987, 66). Below, we briefly discuss literature
related to pattern recognition and hypothesis generation and discuss how our research
extends prior studies of auditors’ analytical judgments. The section ends with
research questions addressed.


1 Margolis (1987) extends this idea, proposing that pattern recognition is the basis of all cognition.


Pattern Recognition 


Pattern recognition has been studied extensively in psychology (primarily sensory 
perception) and artificial intelligence (pattern matching algorithms). In accounting, the 
study of pattern recognition has predominantly involved intuitive time-series
extrapolation of financial trends (Eggleton 1976, 1982; Biggs and Wild 1985).² Pattern
recognition in analytical procedures also involves recognizing relationships among pieces of
financial information, where a concept or causal agent underlies the relationship. 
While this is similar to intuitive time-series extrapolation, it is more complex since inte- 
gration of data from multiple accounts is required. In that way, it is more like recog- 
nition of patterns of symptoms by physicians (Johnson et al. 1982) or recognition of 
board configurations by chess masters (Chase and Simon 1973). 

There is some evidence that auditors may have difficulty in recognizing patterns in 
financial data. First, pattern recognition may not be emphasized in the practice of ana- 
lytical procedures. While there is little research on this issue, survey evidence indicates 
that fairly simple methods are used in practice (e.g., Biggs and Wild 1984; Tabor and 
Willis 1985). This suggests that pattern search and analysis are also not common in 
practice. Second, Biggs and Wild (1985) found that auditors had difficulty in accurately 
extrapolating more complex time-series data. Third, early empirical research in other
auditing tasks (reviewed in Ashton 1982 and Libby 1981) has suggested that auditors do 
not readily combine information into patterns in making judgments. More recently, 
however, researchers such as Schepanski (1983) and Brown and Solomon (1991) have 
questioned whether these early policy-capturing studies were sufficient to determine 
use of pattern-based (or configural) processing. 


Hypothesis Generation 


Hypothesis generation is the process of developing a proposal about an underlying 
event or principle that explains a recognized pattern of data. Schank (1986) provides 
theory on the cognitive processes involved in explanation. Schank states that the pro- 
cess of generating explanatory hypotheses involves attempted matching of observed 
patterns with those stored in memory. If a pattern of observations has been seen before, 
that event can be easily explained “as an instance of an earlier event of a similar kind” 
(1986, 47). 

If the new situation does not match patterns stored in memory, then an “anomaly” 
is perceived. In the face of an anomaly, the decision maker attempts to develop and test 
hypotheses that explain or make sense of the anomalous situation. New explanations 
may be constructed by combining knowledge from several domains or by using rules or 
analogies to adapt current knowledge to fit the new situation. This “additive
explanation” activity is important in Schank’s theory because it is the creative adjustment of
existing knowledge that results in learning. If an explanation cannot be created, the
individual would generate a “canned” explanation that might be very general and readily
available from a similar knowledge structure (Schank 1986, 27-29).


2 One exception is Kinney’s (1987) simulation of the relative ability of a rule-based pattern analysis scheme
to signal errors. While Kinney’s study provided evidence of the usefulness of pattern analysis, it was not a study
of auditor judgment.


Related Research in Auditing 


Previous studies on the cognitive aspects of analytical procedures follow two basic 
strategies. One is that of Biggs et al. (1989, hereafter referred to as BMW), who used ver- 
bal protocol analysis to obtain broad, descriptive evidence of auditors’ reasoning pro- 
cesses in analytical procedures and of the link between analytical procedures judg- 
ments and audit program decisions. Although BMW’s case materials were extensive, 
only one known problem area was included, allowing limited evidence to assess
auditor performance. BMW present a general process model of analytical procedures
involving information acquisition, evaluation, combining evaluated information to make
a choice, and feedback. This model implies, but does not specifically incorporate, pat- 
tern recognition and hypothesis generation, which are viewed here as components of 
combining evaluated information. Therefore, this study extends BMW by specifically 
investigating pattern recognition and hypothesis generation. A further extension is the 
presence of an outcome criterion, which enables evaluation of auditors’ reasoning pro- 
cesses as they influence performance. 

A second strategy for study of the cognitive aspects of analytical procedures is 
represented by Libby (1985) and Libby and Frederick (1990). Their research used hy- 
potheses generated to provide evidence that auditors organize knowledge around audit 
cycles. This research extends their approach by investigating the use of knowledge re- 
flected in auditors’ prehypothesis reasoning and in the hypotheses generated. The im- 
portance of studying processes of knowledge use is stressed by Johnson et al. (1982, 202; 
emphasis added): 

A given clinical judgment requires not only specific kinds of medical knowledge, but
also processes or strategies for using this knowledge to consider possible disease
hypotheses and the potential relationships existing between diseases and patterns of cues
in the patient signs and symptoms.


Research Questions 


This study uses a case with an outcome criterion and verbal protocol analysis to 
gain understanding of how the processes of pattern recognition and hypothesis genera- 
tion affect performance. Since little evidence on this topic exists in the auditing litera- 
ture, our research questions are exploratory in nature: 


1. Are auditors able to generate the correct hypothesis? 

2. What processes lead to correct performance? 

3. If the correct hypothesis was not selected, was the pattern of discrepancies
recognized?

4. What specific processes of hypothesis generation inhibited subjects from ex- 
plaining the recognized pattern? 


II. Methods
Task Development and Validation 


Task development was guided by three goals. First, the task should have a criterion 
outcome. Second, it should be taken from practice in order to enhance external validity.
Third, it should be reasonably complex, in order to generate significant prehypothesis
reasoning from subjects. To meet these goals, we worked with a senior manager at a Big 
Six firm to identify a suitable task from his firm’s practice. A situation involving an 
error in the allocation of overhead to inventory was selected. Considerable over-budget 
audit activity (recounting of inventory and additional compliance tests) could have 
been avoided had the audit focus been narrowed through analytical procedures. 

The case was developed by establishing correct balances and then seeding the mis- 
allocated overhead. The senior manager who participated in identifying the audit 
problem read the case to ensure that it was consistent with the firm’s experience. The 
task was then revised several times based on comments by a partner at a second Big Six 
firm to ensure that: (1) the case was not firm-specific; (2) it was reasonably realistic; and 
(3) subjects would not need other materials to solve it within a reasonable period (ap- 
proximately one hour). Pilot testing was performed by two auditors in one of the Big Six 
firms, two doctoral students, and an auditing professor, all of whom have public ac- 
counting experience. Table 1 shows the resulting task (percent changes included in the 
table were not provided to subjects). 

Financial information was presented as “projected” and “unaudited,” differing 
only by the presence of the seeded error. Projections were described as summaries of 
expectations based on results from previous quarters of the current year, past audited 
results, and industry trends. Unaudited values were the client’s representations of end- 
of-year balances. Subjects were told that the projections were prepared by the auditing 
firm and could be considered reliable. In that way, their attention was focused on the 
client’s numbers as the source of differences, not on accuracy of projection.

Subjects were told to assume that only one error or misrepresentation caused the 
discrepancies in the case and were asked to give their “most likely hypothesis” (MLH) 
about an error or misrepresentation that could have caused the observed discrepancies. 
The “most likely” form of response is consistent with the Einhorn and Hogarth view 
(1986, 4) that causal reasoning usually involves uncertainty. When performing the case, 
subjects assumed a supervisory role on a continuing audit, advising the senior regard- 
ing a series of discrepancies that the senior had discovered. Included in the case was 
management’s representation of the cause of the discrepancies (a large year-end pur- 
chase at favorable prices), which explains some, but not all, of the discrepancies seen. 

The premise underlying the case construction was that a subject could not identify 
the error that caused the discrepancies unless those discrepancies were considered as a 
pattern. There were four crucial discrepancies to be reconciled: increases in inven- 
tory and income (either net income or income before taxes), decrease in gross mar- 
gin, and no change in sales. If these discrepancies are considered as one pattern to be 
explained, then the resulting conclusion is that some part of SG&A expense was
capitalized to inventory (and under a FIFO system was partially expensed in cost of sales).³
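
The arithmetic behind this pattern can be sketched from the cue values reported in table 1. The sketch below is illustrative only: the function and variable names are hypothetical, it assumes that every expense between gross margin and income before taxes is SG&A, and because the table 1 ratios are rounded the dollar amounts are approximate.

```python
# Cue values transcribed from table 1 of the case. The dictionary layout and
# function names are illustrative assumptions, not part of the original study.
SALES = 6_000_000

PROJECTED = {"gross_margin": 0.287, "ibt_pct": 0.075, "inventory": 1_500_000}
UNAUDITED = {"gross_margin": 0.278, "ibt_pct": 0.100, "inventory": 1_650_000}


def income_statement(cues, sales=SALES):
    """Back out dollar amounts from the ratio cues (rounded to whole dollars)."""
    cost_of_sales = round(sales * (1 - cues["gross_margin"]))
    income_before_taxes = round(sales * cues["ibt_pct"])
    # Assumption: every expense below the gross margin line is treated as SG&A.
    sga = sales - cost_of_sales - income_before_taxes
    return {"cost_of_sales": cost_of_sales, "ibt": income_before_taxes, "sga": sga}


def implied_shift(projected, unaudited):
    """Differences (unaudited minus projected) for the accounts in the pattern."""
    p, u = income_statement(projected), income_statement(unaudited)
    shift = {name: u[name] - p[name] for name in p}
    shift["inventory"] = unaudited["inventory"] - projected["inventory"]
    return shift


shift = implied_shift(PROJECTED, UNAUDITED)
for account, change in shift.items():
    print(f"{account:>14}: {change:+,}")
```

Under these assumptions, SG&A expense falls by roughly $204,000, of which about $150,000 remains in ending inventory and about $54,000 passes through cost of sales. That split is what one would expect if SG&A costs had been capitalized to inventory and partially expensed under FIFO.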


Subjects 


Subjects were provided by two New England offices of Big Six firms. The senior 
manager and partner, previously mentioned as being involved with case development 


3 This pattern is consistent with an error or misrepresentation in allocating overhead (or other costs) to
inventory. So far as we have been able to determine, no other error can account for the discrepancies. However,
in analyzing protocols we were careful to ensure that subjects’ reasoning processes did not include
assumptions or other reasons that could allow any other hypothesized error to be correct.


Table 1
Financial Information Included in the Case

                                                    1987          1987          %
                                                    Projected     Unaudited     Change
Ratios
  CR   Current Ratio =
       Current Assets / Current Liabilities         1.50          1.52          +1.3
  QR   Quick Ratio =
       (Current Assets Less Inventory) /
       Current Liabilities                          1.00          0.98          <2.0>
  GM   Gross Margin %                               28.7%         27.8%         <3.1>
  IBT  Income Before Taxes as % of Sales            7.5%          10.0%         +33.3
  ITO  Inventory Turnover =
       Cost of Goods Sold / Ending Inventory        2.85          2.62          <8.1>
  RTO  Receivable Turnover =
       Sales / Ending Receivables                   3.00          3.00          -
Balances
  INV  Inventory (FIFO), 12/31/87                   $1,500,000    $1,650,000    +10.0
  CA   Current Assets, 12/31/87                     4,496,000     4,646,000     +3.3
  NI   Net Income, 1987
       (Assume a marginal tax rate of 40%)          270,000       360,000       +33.3
  SA   Sales, 1987                                  6,000,000     6,000,000     -


Cue abbreviations and percent changes were not included in case information, but are shown here for
descriptive purposes.


and validation, selected subjects to perform the task. Being familiar with the case, they 
were able to select subjects with relevant background experience. In order to obtain a 
range of performance, 12 seniors and ten managers were selected. Subjects were
guaranteed anonymity by the researchers and were only identified by a code number on the
tapes and written materials. From the 22 subjects who completed the task, 21 usable
responses were obtained, including ten managers and 11 seniors.⁴


4 One of the tapes could not be transcribed since the verbalizations were inaudible.


Procedures for Data Collection and Analysis


Each session took place in a conference room or office at the subject’s workplace.
As they worked through case materials to develop their MLH, subjects’ responses were
collected in the form of think-aloud verbal protocols. In accordance with procedures
recommended by Ericsson and Simon (1984), the instructions to think aloud were
general requests to say everything that comes to mind as the task was performed. The
researchers were present to operate the audio recorder and to remind the subjects to
think aloud if they fell silent for more than a few seconds. In order to become
comfortable with the task and think-aloud procedures, subjects were asked to complete a simple
analytical procedures task before beginning the research task.⁵ The audio tapes were
transcribed and analyzed later by the researchers. Protocols were coded independently
by two individuals (generally one author and a doctoral student), agreement statistics
computed, and differences reconciled by the other author. Specific coding rules and
agreement are presented with the relevant results (Kappa coefficients are reported only
when agreement is below 90 percent).
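
For readers unfamiliar with the two agreement statistics mentioned here, the following is a minimal sketch of how percent agreement and Cohen's kappa can be computed for two coders. The function names and the ten example codes are hypothetical and are not taken from the study's protocols.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of items on which the two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of each coder's marginal proportions per code.
    expected = sum(
        counts_a[code] * counts_b[code] for code in set(counts_a) | set(counts_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten items by two coders (C = Correct, I = Incorrect).
coder_1 = ["C", "C", "C", "I", "I", "I", "I", "I", "I", "I"]
coder_2 = ["C", "C", "I", "I", "I", "I", "I", "I", "I", "I"]

agreement = percent_agreement(coder_1, coder_2)
kappa = cohens_kappa(coder_1, coder_2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

In this made-up example the coders agree on nine of ten items, but kappa is noticeably lower than 0.90 because much of that agreement would be expected by chance when one code dominates.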


III. Results and Discussion


Results are presented in three sections: classification of task performance, reason- 
ing processes of correct subjects, and reasoning processes of incorrect subjects. 


Classification of Task Performance 


The first research question addresses performance quality. Each subject’s MLH
was evaluated by determining whether it would have caused all discrepant cues to vary
from projected in the same direction as found in the case. If the MLH would have
caused all cues to vary similarly, it was deemed “Correct”; otherwise, it was
“Incorrect.”⁶ Table 2 presents the subjects’ MLHs and the results of the Correct/Incorrect
classification. Six of the 21 auditors proposed correct MLHs, and five of those
were managers.⁷ The remaining MLHs were incorrect in different ways. Eight MLHs
were specific errors that did not correctly explain all discrepancies. Four were too
general to be correct (i.e., specific discrepancies caused could not be determined), and
three simply identified a discrepancy in a single account rather than a specific error.
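
The direction-matching rule described above can be sketched as follows. This is a simplified illustration, not the authors' coding scheme: it uses only the four crucial discrepancies named in the Methods section, and the cue encodings for the two example hypotheses are assumptions made for the sketch.

```python
# Direction of each crucial discrepancy in the case relative to projection:
# +1 = increase, -1 = decrease, 0 = no change.
OBSERVED = {"inventory": +1, "income": +1, "gross_margin": -1, "sales": 0}


def classify_mlh(predicted_effects):
    """Return 'Correct' only if the hypothesis moves every cue as observed.

    A cue that the hypothesis does not address (missing key) counts as a
    mismatch, as does a cue moved in the opposite direction.
    """
    for cue, direction in OBSERVED.items():
        if predicted_effects.get(cue) != direction:
            return "Incorrect"
    return "Correct"


# Assumed effects of SG&A costs being capitalized to inventory and partially
# expensed through cost of sales.
overhead_allocation_error = {
    "inventory": +1, "income": +1, "gross_margin": -1, "sales": 0,
}

# Assumed effects of recording a sale without relieving inventory: cost of
# sales is understated, so gross margin would rise rather than fall.
failure_to_relieve_inventory = {
    "inventory": +1, "income": +1, "gross_margin": +1, "sales": 0,
}

print(classify_mlh(overhead_allocation_error))
print(classify_mlh(failure_to_relieve_inventory))
```

A hypothesis that names only a single account, such as "inventory overstated," would also be classified as Incorrect here because it leaves the remaining cues unaddressed.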


Reasoning Processes of Correct Subjects 


The second research question addresses the issue of how Correct subjects performed
the task. Of the six Correct subjects, four (M1, M5, M6, and M10) performed the
task efficiently (i.e., they encountered few difficulties and completed the task in an
average of about 50 lines of protocol, compared to 226 and 181 lines for M3 and S10,
respectively). A flowchart of M5’s decision process is analyzed as representative of these
four subjects. The other two Correct subjects (M3 and S10) initially encountered
difficulties that had to be overcome before they successfully completed the task. To
illustrate one such case, we discuss separately the means used by M3 to resolve his or her
difficulties.


5 In the practice case, cues were: an increase in the current ratio, an increase in earnings per share, and no
change in current assets. Subjects were told that there could be a number of specific causes of these
discrepancies and that they should choose one as “most likely.”

6 Coding of the MLH was applied by one of the authors and a doctoral student with extensive public
accounting experience. Agreement between the two coders was 95 percent. After measuring agreement, coding
differences were reconciled by the second author.

7 Subjects’ protocols were examined for assumptions regarding factors such as materiality that might have
affected their outcomes. This resulted in a “Correct” classification for M6, who made an assumption that the
change in gross margin percentage was not material, so that discrepancy did not have to be explained in his
or her answer.


Table 2
Classification of Subjects’ Most Likely Hypotheses

Subjects                     Typical Wording of MLH

Correct/Specific
  M1, M3, M5, M6, M10, S10   Overhead allocation error (capitalized inventory costs that
                             should have been expensed below the gross margin line)
Incorrect/Specific
  M7, S4                     Failure to record SG&A expense and liability
  M2                         Large year-end purchase not accrued (cut-off error)
  M4, S7                     Costs capitalized to inventory; not going through cost of sales
  M8                         Recorded sales in a certain period but didn’t relieve the inventory
  S1                         Sale not recorded properly (created sales in COGS)
  S5                         Sales not recorded
Incorrect/General
  M9, S2, S11                Errors in pricing/valuation of inventory
  S8                         Poor inventory management
Incorrect/Account
  S3, S8                     Inventory overstated
  S9                         Cost of sales overstated

Correct = MLH explains all discrepancies between projected and actual values of cues given in the case.
Incorrect:
  Specific = MLH identifies accounts debited and credited; does not address at least one discrepancy,
             or explains at least one discrepancy in the direction opposite to that given in the case.
  General = MLH identifies a general category of error rather than specific accounts.
  Account = MLH identifies a single account (cue) to be investigated.

Since the decision processes of the four “efficient” subjects were similar, M5’s pro- 
tocol is used as a representative example of how these Correct subjects performed the 
task. Figure 1 presents a flowchart of M5’s decision process, in which cues are repre- 
sented with circles and inferences are represented with rectangles. Panel A of the 
figure shows behavior during acquisition and evaluation of cues, while panel B shows 
the subject combining crucial cues into a pattern and generating an MLH that explains 
them. Triangles denote inferences developed during acquisition that carry over into the 
integration phase (e.g., “A” denotes inventory has increased relative to projection). 
Numbers in the left margin refer to the approximate order in which the process steps 
occurred in the protocol. 

There are several important aspects of M5’s decision process that illustrate how the
Correct managers solved the case. First, by the end of panel A, M5 had acquired and
correctly interpreted all crucial cues. Second, panel A shows M5 making inferences
from combinations of discrepancies. For instance, the boxes linked together in lines 3
and 4 show M5 inferring from discrepancies in the gross margin and income before
taxes that two explanations are possible. One alternative (regarding a possible increase
in sales) is subsequently disconfirmed in line 6. The other (SG&A expenses decreased)
seems to be retained in memory and is not mentioned again until line 14. Third, in
panel B, the fact that all crucial cues are linked to a common inference indicates that
M5 recognized them as a coherent pattern. Fourth, after reiterating the inference that
the discrepancies point to problems in SG&A expenses, M5 readily proposes a
misallocation in costs between SG&A and inventory as the MLH, in line 15.


Figure 1
Flowchart of M5’s Decision Process (Correct MLH)

Panel A. Information Acquisition and Initial Evaluation
Panel B. Pattern Recognition and Hypothesis Generation
(See table 1 for abbreviations within circles.)

Inferences carried over from panel A to panel B (denoted by triangles):
A = Inventory is increased
B = SG&A expense is decreased
C = Income is increased
D = Sales are unchanged


M5’s decision process is illustrative of how the four efficient managers were able to
integrate all relevant discrepancies into a coherent pattern and to access a relevant ac- 
counting error in memory that explains the pattern. The flowchart and the underlying 
protocols give no indication that M5 perceived an anomaly in the pattern (Schank 
1986), but rather indicate that the pattern was interpreted simply on the basis of knowl- 
edge of accounting principles and the effects of error on the accounts. 

While M5’s decision process flows smoothly from acquisition to MLH proposal, 
M3 failed to acquire the increase in income cue until line 203 of the protocol. This 
acquisition problem caused M3 to formulate the problem as involving only inventory, 
cost of sales, and current liabilities. But, while evaluating that representation of the 
problem by constructing a trial income statement (about line 202), M3 discovered that
the change in income had not been acquired. Once this cue was acquired, M3 success- 
fully completed the task in just over 20 additional lines of protocol. Recovery from the 
initial acquisition error resulted from the effort to test the coherence of a hypothesis, 
using the structure of income statement relationships. Testing coherence of hypotheses 
is important in theory (e.g., Einhorn and Hogarth 1987, quoted earlier, and Schank’s 
1986 process of sense-making and hypothesis modification) and has been found to be 
associated with good performance in other domains. For example, Robertson (1990) 
found that performance was associated with use of physics relationships to structure 
problem-solving methods. 


Reasoning Processes of Incorrect Subjects 


Schank (1986) proposes that reasoning processes can be best understood by study- 
ing instances in which the decision makers have difficulty constructing explanations. 
Our primary objective in analyzing reasoning of Incorrect subjects was to determine 
whether problems preventing a correct solution occurred at the pattern recognition or 
hypothesis generation stage. However, several subjects had acquisition errors that pre- 
vented recognition of the pattern; these are discussed separately prior to addressing the 
research questions. The remaining subjects were classified as having made pattern 
recognition or hypothesis generation errors.⁸ Classification results are presented in
figure 2 as a taxonomy of errors. Each step is identified by a question; a “no” answer 
indicates that an error has been made at that step. The taxonomy is progressive in that a 
subject who made an error at one step successfully performed the previous steps. 
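
The progressive structure of the taxonomy can be sketched as a short classification routine. The step names and the dictionary encoding below are hypothetical; the example data are constructed only to reproduce the counts reported in the synopsis (three acquisition errors, four pattern recognition errors, eight hypothesis generation errors, and six correct), not the coded protocols themselves.

```python
from collections import Counter

# Each step is a yes/no question; the first "no" determines the error type.
STEPS = [
    ("acquired_all_cues", "ACQUISITION ERROR"),
    ("recognized_pattern", "PATTERN RECOGNITION ERROR"),
    ("explained_pattern", "HYPOTHESIS GENERATION ERROR"),
]


def classify(answers):
    """Walk the steps in order and stop at the first one answered 'no'.

    Later steps are never consulted once an error is found, which mirrors the
    progressive nature of the taxonomy. Missing answers are treated as 'no'.
    """
    for question, error_label in STEPS:
        if not answers.get(question, False):
            return error_label
    return "CORRECT"


# Constructed example data matching the counts reported in the paper.
coded_subjects = (
    [{"acquired_all_cues": False}] * 3
    + [{"acquired_all_cues": True, "recognized_pattern": False}] * 4
    + [{"acquired_all_cues": True, "recognized_pattern": True,
        "explained_pattern": False}] * 8
    + [{"acquired_all_cues": True, "recognized_pattern": True,
        "explained_pattern": True}] * 6
)

tally = Counter(classify(subject) for subject in coded_subjects)
for label, count in tally.items():
    print(f"{label}: {count}")
```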

ACQUISITION ERROR. Selective perception can influence acquisition and
interpretation of information (e.g., Hogarth 1987), and both acquisition and correct
interpretation are necessary to enable recognizing a pattern of cues.⁹ The process trace allowed
evaluation of whether an Incorrect subject’s error was due to an acquisition or an
interpretation problem. A subject was designated as having made an acquisition error if
mention of one of the crucial cues was not present in the protocol. An interpretation 
error was made if by misreading a cue the subject was prevented from correctly formu- 
lating the problem. 


8 There was 90 percent agreement between the authors on classification of subjects within the taxonomy of
errors.

9 M3 (discussed earlier as having corrected an acquisition problem) illustrates the enabling nature of good
acquisition behavior in this task. Considerable processing without the income cue was unsuccessful, but after
acquiring it M3 was able to generate the correct hypothesis very quickly.


Figure 2
Taxonomy of Reasoning Errors

Did subject acquire and correctly interpret all cues?
    NO:  ACQUISITION ERROR (S1, S7, M9)
    YES: proceed to next question
Did subject recognize pattern that needs to be explained?
    NO:  PATTERN RECOGNITION ERROR (S2, S6, M4, M8)
    YES: proceed to next question
Did subject explain all parts of pattern correctly?
    NO:  HYPOTHESIS GENERATION ERROR (S3, S4, S5, S8, S9, S11, M2, M7)
    YES: CORRECT (M1, M3, M5, M6, M10, S10)


Three subjects were found to have acquisition or interpretation errors. M9 did not 
acquire either cue representing the discrepancy in income, and S1 and S7 made errors 
in interpretation. For example, S1 used the value of income before taxes to sales as the 
value of gross margin. This error led to the subject’s conclusion that there were no ex- 
penses other than cost of sales, preventing an understanding of the actual source of 
additional costs capitalized to inventory. Since acquisition is not the primary focus of 
this research, subjects not meeting this enabling condition were omitted from further 
process analysis. 

PATTERN RECOGNITION ERROR. The third research question addresses the 
issue of pattern recognition. Subjects were designated as having made a pattern recog- 
nition error if they did not use all crucial cues in combination. Recognizing the crucial 
cues as a coherent pattern was an essential step in correctly performing the task. In 
order to determine whether subjects were attempting to view those cues in combina- 
tion, the protocols were examined for instances in which they were part of a common 
inference, or were cross referenced. 

Of the 18 subjects who acquired and interpreted all cues correctly, only four (S2, S6,
M4, and M8) did not attempt to use all the crucial cues in combination. A
representative example of a pattern recognition error is the protocol of M8, shown in figure 3. In
contrast to M5 (the Correct subject shown in figure 1), M8’s flowchart has fewer
connections between cues. In panel B, M8 uses only the increases in inventory and current
assets in formulating the MLH. Income statement cues were not used. This can be
contrasted with panel B for M5, in which all crucial cues were combined to generate the
MLH. M8’s MLH (failure to relieve inventory) was a specific error that could have been
disconfirmed by reference to the previous inference that cost of sales was increased
(line 6).


Figure 3
Flowchart of M8’s Decision Process (Incorrect MLH)

Panel A. Information Acquisition and Initial Evaluation
Panel B. Pattern Recognition and Hypothesis Generation

Inferences carried over from panel A to panel B (denoted by triangles):
A = Gross margin is decreased
B = Current assets are increased
C = Inventory turnover is decreased

Subjects who made pattern recognition errors tended to reason with one or two cues 
at a time instead of attempting to reason with the combination of all crucial cues. The 
basis for recognizing these cues as a pattern is knowledge of accounting principles 
(specifically the double-entry system and cost allocation) that determine which cues 
should be combined. Since there is little doubt that these auditors would be aware of 
such principles, they apparently had difficulty in accessing this knowledge while
performing the task. One explanation for the failure to access knowledge needed to
recognize the pattern is that these subjects focused on “surface” features of the task (Chi et
al. 1982), which prevented them from combining cues based on more complex
underlying principles or relationships (Qin and Simon 1990). These subjects’ lack of pattern
recognition seems plausible given Kinney’s (1987) conclusion that analytical
procedures as performed in practice do not stress the use of data in combination.

HYPOTHESIS GENERATION ERROR. The fourth research question concerns pro- 
cesses of hypothesis generation that inhibited subjects from explaining the recognized 
pattern. Subjects who recognized the pattern but did not propose a correct MLH were 
designated as having made hypothesis generation errors. Since the verbal protocols 
provide a trace of how hypotheses are generated and evaluated throughout the process 
leading up to an MLH, we can detect differences in behavior at this stage that are 
associated with differences in performance. In particular, evidence of how knowedge 
is used in solving the problem is represented in both the type of error proposed (e.g., 
Libby 1985; Libby and Frederick 1990) and the focus or target discrepancies explained 
in the hypothesis (e.g., Johnson et al. 1991). To obtain this evidence, hypotheses pro- 
posed after pattern recognition were first identified.'° Second, these hypotheses were 
coded in a typology based on that of Coakley and Loebbecke (1985, 227) to determine 
the type of error represented. The major types of accounting errors in the typology are: 
Cutoff errors (an entry is made in the wrong period); Classification errors (an amount is 
placed incorrectly within a financial statement, e.g., a current asset is classified as non- 
current); Valuation errors (an entry is included at an inappropriate amount); Recording 
errors (an entry is made to the wrong account); and other errors (which we have termed 
Nondescript). Third, in order to determine which parts of the pattern were being explained,
hypotheses were coded as to their primary focus: INV (primarily addressing increases
in inventory/cost of sales), SGA (primarily addressing decreases in other expenses),
or BOTH.¹¹ Results are shown in table 3.¹²

Table 3 shows that of the 14 subjects who recognized the pattern, six of seven se- 
niors made hypothesis generation errors, compared to only two of seven managers. 
This difference between managers and seniors is significant (Fisher Exact Test, 
p<0.05) and is the only step at which a difference between the two ranks represented in 
our subject group is apparent.¹³ This result, and the process differences reported below,
provide preliminary evidence of a general experience difference at this stage. In the fol- 
lowing analysis, we first compare the hypothesis generation performance of the Incor- 
rect and Correct subjects and then analyze the sequence of hypotheses generated by the 
Incorrect subjects. 


10 In addition to specific accounting errors, a statement of a “problem” in the account was also termed a
hypothesis for this analysis. The rationale supporting inclusion of these problem statements as hypotheses is 
that auditors often modify subsequent testing when they think an error has occurred, even if the exact nature of 
the error has not been determined. In fact, several MLHs were not specific errors (see table 2). Hypotheses were
identified independently by one of the authors and the doctoral student referred to in footnote 6. Agreement 
between the codings was 97 percent. 

11 Error types and focus were coded independently by the authors. Agreement between the codings of error
type was 87 percent (kappa = 0.823, p<0.0001). Agreement on focus of the errors was 94 percent. 

12 Table 3 contains only hypotheses proposed after the point at which the subject recognized the pattern, in
order to ensure that all subjects had access to the same pattern prior to the time the hypotheses were proposed. 
For that reason, M6 does not appear with Correct subjects, since due to his or her materiality assumption the 
pattern recognized was different (see fn. 7). 

13 The better performance of managers is consistent with results of Biggs and Mock (1983), who showed
that managers were better able to organize diverse pieces of incoming information into meaningful chunks. 
The process data given in the following section reinforces the view that managers were better able to maintain 
and access more complete patterns for subsequent processing. 
Table 3
Hypotheses Generated by Subjects Who Recognized the Pattern (see figure 2)

Panel A. Sequence of Hypotheses, Described by Error Typeᵃ and Focusᵇ:

Correct
M1   M3   M5   M10   S10
1. REC/BOTH   1. REC/BOTH   1. REC/INV   1. REC/SGA   1. CLA/BOTH
2. REC/BOTH   2. NON/SGA   2. REC/BOTH   2. NON/BOTH
3. REC/BOTH   3. REC/BOTH   3. REC/BOTH
4. REC/BOTH
5. REC/BOTH
6. NON/BOTH
7. REC/BOTH
Incorrect
Managers   Seniors: Focus on INV   Seniors: Focus on INV and/or SGA
M2   S5   S8   S3   S11
1. CUT/INV   1. REC/INV   1. NON/INV   1. NON/SGA   1. NON/INV
2. VAL/INV   2. REC/INV   2. VAL/INV   2. REC/INV   2. NON/INV
3. VAL/INV   3. REC/INV   3. NON/INV   3. NON/INV   3. NON/INV
4. REC/BOTH   S9   4. NON/INV   4. NON/INV   4. NON/INV
5. CUT/INV   5. NON/INV   5. REC/INV   5. NON/BOTH
6. CUT/INV   1. NON/INV   6. VAL/INV   6. NON/INV   6. VAL/INV
7. CUT/INV   7. NON/INV   7. VAL/INV
2. NON/INV
8. CUT/INV   3. NON/INV   S4   8. VAL/INV
9. NON/SGA   4. NON/INV   9. VAL/BOTH
10. NON/INV   1. VAL/INV   10. VAL/INV
11. CUT/INV
2. VAL/INV
M7   3. VAL/INV
4. VAL/INV
1. REC/SGA   5. REC/SGA
Panel B. Summary Statistics of Error Type and Focus:

Error Focus
             INV   SGA   BOTH   TOTAL
Correct        1     2     13      16
Incorrect     40     4      3      47
Total         41     6     16      63

Error Type
             REC   VAL   CUT   CLA   NON   TOTAL
Correct       12     0     0     1     3      16
Incorrect      8    13     6     0    20      47
Total         20    13     6     1    23      63

a CLA = Classification; CUT = Cutoff; REC = Recording; VAL = Valuation; NON = Nondescript
b INV = Focus of explanation on Inventory; SGA = Focus on SGA; BOTH = Focus on both sides

Comparison of Incorrect and Correct Subjects. Panel B of table 3 reveals differences
between Correct and Incorrect subjects in both type and focus of hypotheses generated. 
In terms of the types of errors represented in hypotheses, Recording errors (the correct 
type) predominated for Correct subjects, while the most frequently proposed type for 
Incorrect subjects was Nondescript. Collapsing the Classification, Cutoff, and Valua- 
tion into one category results in a significant difference in error type between Correct 
and Incorrect (chi-squared = 18.8, p<0.001). Since Nondescript errors were often statements
of a single account discrepancy (e.g., “inventory overstated”), their increased
use by Incorrect subjects suggests a simple or surface approach to the problem. 

The Correct subjects also showed a significantly greater tendency to propose hy- 
potheses focusing on the entire pattern (BOTH), while Incorrect subjects tended to ex- 
plain one side, particularly INV (chi-squared =31.47, p<0.001). Focusing attention on 
the complete pattern is cognitively more demanding than addressing only part, since 
integration of a greater number of cues is required. Although the entire pattern had 
been recognized earlier in the protocol of all these subjects, some were unable to access 
or develop hypotheses that could explain more than part of it. Similarly, in the context
of concurring partner review, Johnson et al. (1991) observed that successful subjects
were able to keep more than one “line of reasoning” (explanation focus) in mind.
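
The two chi-squared statistics reported above can be approximately recovered from the Panel B counts of table 3. The sketch below is illustrative only and rests on assumptions that the text does not state: for error type, Classification, Cutoff, and Valuation are collapsed into one column (as described); for focus, INV and SGA are assumed to be collapsed into a single "one side" column and a continuity (Yates) correction is assumed to be applied to the resulting 2 x 2 table. The function name is hypothetical.

```python
def pearson_chi2(table, yates=False):
    """Pearson chi-squared statistic for a contingency table given as a list of rows.

    If yates is True, 0.5 is subtracted from each |observed - expected|
    (continuity correction, normally used only for 2 x 2 tables).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    return stat


# Error type (table 3, Panel B): columns are REC, (CLA + CUT + VAL), NON.
error_type = [[12, 1, 3],     # Correct
              [8, 19, 20]]    # Incorrect

# Focus (table 3, Panel B): columns are one side only (INV + SGA), BOTH.
# Collapsing INV and SGA is an assumption made for this sketch.
focus = [[3, 13],             # Correct
         [44, 3]]             # Incorrect

type_stat = pearson_chi2(error_type)
focus_stat = pearson_chi2(focus, yates=True)
print(round(type_stat, 2), round(focus_stat, 2))
```

Under these assumptions the script prints 18.88 and 31.47, which is close to the reported values of 18.8 and 31.47. The agreement suggests, but does not establish, that this is how the published statistics were computed.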

Sequence of Hypothesis Generation by Incorrect Subjects. In their effort to propose a 
hypothesis matching the pattern, most subjects generated and evaluated a series of hy- 
potheses, finally selecting one as MLH. Panel A of table 3 shows the sequence of hy- 
potheses generated by Incorrect subjects, indicating the step-by-step use of knowledge 
in construction of explanations for the pattern. In contrast to the Correct subjects who 
either began with the correct error type or began attempting to explain both sides, In- 
correct subjects exhibited a variety of behaviors. For purposes of discussion, they can 
be divided into three groups. 

First, seniors S5, S8, and S9 did not attempt to explain income statement discrepancies
below the gross margin line (SGA). Protocol evidence shows that these subjects
saw the pattern of income statement discrepancies as anomalous, consistent with 
Schank’s (1986) theory of explanation. All these subjects focused on explaining inven- 
tory discrepancies. They also show a tendency to fix on a particular error type, al- 
though the types differ across individuals. One explanation for these behaviors is that 
accessing an error for a particular type or focus in memory makes subsequent access to 
related errors easier. Similar behavior has been observed by Libby (1985), who found 
evidence that the proposal of a hypothesis from a particular accounting cycle was sig- 
nificantly associated with proposal of further hypotheses from the same cycle. If ini- 
tially proposed hypotheses are of the correct type, further processing will be efficient 
since subsequent activation will be of similar errors (Libby 1985; Libby and Frederick 
1990). However, early proposal of an inappropriate hypothesis may be difficult to re- 
verse since the correct type may not be easily generated. 

A second group is composed of the remaining seniors who made hypothesis genera- 
tion errors (S3, S4, and S11). These subjects made more progress, since all were able to 
consider both INV and SGA sides of the pattern. S3 and S4 explained each side 
separately in different hypotheses but were not able to propose an error integrating the 
two. S11, although primarily concerned with explaining inventory, did propose several
errors that addressed both sides. These subjects attempted to provide integrative, 
additive explanations (Schank 1986), but their lack of success indicates that they were 
unable to creatively form linkages between the two sides. 
The two managers who made hypothesis generation errors show different behavior
from the two groups discussed above. M7 proposed only a single hypothesis, which was 
the correct error type but did not address INV discrepancies. M2, in contrast to the 
seniors discussed above, proposed a variety of error types. However, early proposal of a 
cutoff error (at line 4 of the protocol) apparently affected ability to evaluate other hypotheses
in an unbiased manner. In fact, M2 was the only Incorrect subject to propose
the correct hypothesis (his or her fourth), but he or she made little effort to confirm or
disconfirm that hypothesis. Instead, M2 returned to proposing cutoff errors, which 
could have been disconfirmed by evaluation of cues already acquired. Similarly, diag- 
nostic performance of incorrect physicians in Patel and Groen (1986) was affected by 
inability to disconfirm early hypotheses. 


IV. Conclusions and Limitations 


This study examines auditor performance in analytical procedures and the relation- 
ship of performance to two important steps in the analytical procedures process: recog- 
nition of patterns of discrepancies from projected balances and generation of hypoth- 
eses that fit recognized patterns. Following discussion of limitations, we offer several 
issues worthy of further research that are suggested by our results.


Limitations 


There are several limitations to the results presented above. First, the external 
validity of this study is limited since the case contains less information than the real 
audit environment. Second, consultation with other members of an audit team was not 
available. To partially compensate for task limitations, a partner and a senior manager 
of the firms involved in the study assisted in development of the case, and their com- 
ments were incorporated to improve consistency with practice. Third, it is important to 
note that, although verbal protocol procedures used were designed to yield valid data, 
the resulting protocols may not be a complete trace of all thoughts during a decision 
process (Ericsson and Simon 1984). A recent study by Russo et al. (1989) suggests that, 
depending on the nature of the task, concurrent verbal protocols may have an effect on
the accuracy of subjects’ performance. In the tasks studied by Russo et al. that were 
most similar to ours, taking protocols either had no effect or improved accuracy. Thus, 
even though Russo et al. recommend caution in interpreting results of concurrent pro- 
tocols, it may be that they produce enhanced accuracy in comparison with task perfor- 
mance in a nonverbalization condition. Also, while protocol analysis involves some 
subjectivity, the use of multiple coders provided measures of agreement to give evi- 
dence of coding reliability. 


Conclusions


The first issue suggested by our results concerns how pattern recognition and hy- 
pothesis generation interact to determine auditor performance. We found that hypoth- 
esis generation inhibited performance more frequently than did pattern recognition. 
This issue deserves further research because in practice, these two processes may inter- 
act in a variety of ways due to features of the analytical procedures situation. For exam- 
ple, the number of errors present and their perceived base rates may affect auditors’ 
ability to recognize patterns or explain them. Further study of the relative difficulty of 
these two processes in different audit situations is needed to fully understand their relationship.
The second issue involves understanding the creative process of generating causal
hypotheses. We found that the decision processes of most Correct subjects resulted in 
an efficient proposal of an MLH. Of those who did not quickly find a match for the rec- 
ognized pattern, few were able to construct the correct solution even with an extensive 
decision process. Creative construction of explanations as described in Schank’s (1986) 
theory would help auditors resolve anomalous situations. For example, is personal ex- 
perience with similar situations necessary to resolve anomalies or can training and/or 
decision assistance enable effective transfer of experience across tasks or among indi- 
viduals? 

A third issue concerns the effectiveness of auditors’ hypothesis evaluation be- 
havior. Our results indicate that a number of subjects had difficulty in evaluating hy- 
potheses. For example, M2, who at one point proposed the correct hypothesis, was 
unable to disconfirm an early hypothesis. This auditor and seven others did not discon- 
firm their self-generated hypotheses, even though disconfirming cues had already been 
acquired. Similarly, Heiman (1990) found that auditors do not automatically generate 
alternative hypotheses when performing analytical procedures and concluded that 
audit effectiveness could be compromised if auditors overestimate the likelihood of a 
hypothesized cause.¹⁴ In contrast, Kida (1984) and Anderson and Kida (1989) found that
auditors attended more to disconfirming than confirming information when evaluating 
inherited hypotheses. Since these studies differ in design and in the audit decision con- 
text studied, further research that attempts to reconcile these divergent views of audi- 
tors’ use of disconfirming information is called for. 

Fourth, this study touches on a number of aspects of an important general issue in 
audit planning: the relative emphasis on analytical procedures versus other substantive 
tests. Our results indicate that the hypotheses generated by a number of subjects show 
insufficient use of analytical procedures, in that better use of the information provided 
could have narrowed the focus of subsequent testing, thus improving audit efficiency. 
This result is in line with Kinney’s (1987) proposal that insufficient analytical proce- 
dures may result in reduced audit quality since other substantive tests may be reduced 
due to reliance on seemingly satisfactory analytical results. However, in practice, there 
are many factors that affect the decision to move from analytics to other substantive 
procedures, including variance in individual comfort levels, the extent of use of stan- 
dard audit programs, etc. Given the importance of this issue, research is needed to 
address the costs and benefits of analytical procedures versus other audit tests. 


14 In the context of medical diagnosis, Johnson et al. (forthcoming) observed “garden path errors,” in which 
expert physicians ignored or explained away information conflicting with currently held hypotheses. 
THE ACCOUNTING REVIEW


Incidence and Circumstances of 
Accounting Errors


Mark L. DeFond
University of Southern California 
University of Washington


SYNOPSIS: Accounting Principles Board (APB) Statement No. 20 (1971) 
defines financial statement errors as items resulting “from mathematical 
mistakes, mistakes in the application of accounting principles, or the over- 
sight or misuse of facts that existed at the time the financial statements 
were prepared” (APB 20, par. 13). This definition encompasses both
intentional and unintentional misrepresentation by management. Errors af- 
fecting previously reported earnings are revealed as prior period adjust- 
ments as specified in Statement of Financial Accounting Standards No. 16 
(1977). If the erroneous year’s financial statements are presented, retroactive
restatement is required (in accordance with APB 9 1966). Footnote
disclosure of the nature of the error and its effect on earnings, earnings 
before extraordinary items, and earnings per share is also required. 
Although financial statement disclosures generally provide no indica- 
tion that prior errors were intentional, they may be motivated by the same 
types of economic incentives influencing managers’ choices of accounting 
methods or management of accruals.¹ In this study, we examine the incidence
of accounting errors revealed by prior period adjustments for 41
firms in comparison with a control group of another 41 firms. This comparison
is used to highlight circumstances that are likely to motivate managers
to use errors as an income management tool.²

While corrections of prior year earnings are rare for both over- and 
understatement errors, the latter are relatively less frequent. Our investigation
revealed 41 overstatement firms but only three understatement
firms, which is consistent with an income-increasing motivation.³ Because
of the very small number of understatements, the analysis is limited to 
overstatement errors. We find that the earnings overstatements are nega- 
tively correlated with the growth in earnings. Analysis also indicates that 
earnings overstatements are more likely when firms have diffuse 
ownership, lower growth in earnings, and fewer income-increasing GAAP
alternatives available. Overstatements are less likely among firms that have 
audit committees. These results are generally consistent with the view that
overstatement errors are the result of managers responding to economic 
incentives. 


Key Words: Error, Overstatement, Choice, Incentive. 


Data Availability: The data used in this study are available from Mark
DeFond.


The remainder of this paper consists of four sections. In the next section, incentives
to overstate earnings are discussed. Section II presents sample selection
procedures and our approach to data analysis. Section III presents an analysis of
the sample firms, and section IV provides a summary of the results.


I. Incentives to Overstate Earnings


Previous studies have tested hypotheses that managers choose GAAP that maximize
their wealth, which is a function of accounting-based contracting variables. Contracting
incentives generally motivate selection of accounting procedures that enhance

. We make use of this literature by designing tests to determine if similar incen- 
tives motivate managers to increase reported earnings via practices that violate GAAP 
(namely, overstating earnings).


2 A recent study by Kinney and McDaniel (1989) examined correction of errors in previously issued quar- 
terly reports. They find that the average stock return between the issuance of misstated quarterly reports and 
their correction in annual reports is negative. They also provide descriptive information indicating that the 
majority of restatements “are made for firms listed over-the-counter and the typical firm is smaller, less profit- 
able, more highly leveraged, slower growing, and received more uncertainty qualified opinions than others in 
the same industry” (p. 25). While the incentives to misstate annual financial statements (the statements of inter- 
est in our study) and controls in place to detect such misstatements may differ from the incentives and controls 
related to errors in interim (quarterly) reports, their results are generally consistent with our contention that 
errors may be the result of management responding to incentives to enhance

3 Firms were identified by a search of the National Automated Accounting Research System data base (conducted
in September 1989) and by examination of Accounting Trends and Techniques for the years 1977 through
1988.
Compensation and Incentives to Overstate Earnings


Based on the work of Watts and Zimmerman (1978), Dhaliwal et al. (1982) argue that 
firms with diffuse ownership are more likely to use accounting-based bonus compensa- 
tion schemes to influence managers to act in the best interests of outside owners. Such 
compensation schemes may also motivate managers to select accounting methods that 
increase reported earnings. Consistent with this conjecture, they found that firms with 
diffuse ownership tend to choose depreciation methods that increase earnings. Al- 
though the costs and benefits of intentional overstatement of earnings are different 
from the costs and benefits associated with increasing earnings using a generally ac- 
cepted accounting method, managers of firms with diffuse ownership may be more 
likely to overstate earnings. 

As in Dhaliwal et al. (1982), we define a firm as having diffuse ownership if no one
party controls five percent or more of the outstanding common stock. They classified
firms as having concentrated ownership (so-called owner-controlled firms) if one party 
owned ten percent or more of the voting stock and exercised active control or if one 
party owned 20 percent or more of the voting stock. This approach results in some 
firms being classified as having neither diffuse nor concentrated ownership. Because of
the necessity of avoiding sample attrition, we simplify this approach by classifying 
firms as either having diffuse ownership or not having diffuse ownership. 
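
The difference between the three-way scheme attributed above to Dhaliwal et al. (1982) and the simplified two-way classification can be sketched as follows. This is only an illustration under stated assumptions: the function names, the representation of the largest holding as a fraction, and the `active_control` flag are hypothetical, and the sketch ignores the distinction between voting stock and outstanding common stock.

```python
def dhaliwal_class(largest_stake, active_control=False):
    """Three-way classification as described in the text.

    largest_stake: largest single party's holding as a fraction (0.12 = 12 percent).
    active_control: whether that party exercises active control (hypothetical flag).
    """
    if largest_stake >= 0.20 or (largest_stake >= 0.10 and active_control):
        return "concentrated"
    if largest_stake < 0.05:
        return "diffuse"
    return "unclassified"   # neither diffuse nor concentrated


def diffuse(largest_stake):
    """Two-way DIFFUSE indicator: 1 if no one party holds five percent or more."""
    return 1 if largest_stake < 0.05 else 0


# A firm whose largest holder owns 12 percent without active control falls
# outside both groups in the three-way scheme but is retained (DIFFUSE = 0)
# under the two-way scheme.
print(dhaliwal_class(0.12), diffuse(0.12))
```

The two-way indicator keeps every firm in the sample, which is the stated reason for the simplification.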

For most firms, management compensation is explicitly or implicitly a function of 
some earnings-based performance measure. The emphasis on earnings encourages 
managers to meet earnings targets but, as pointed out by Healy (1985), managers do not
have an incentive to exceed targets by a large margin. Using prior year earnings per 
share as a measure of an earnings “target,” Ayres (1986) finds that firms choosing early 
adoption of SFAS 52 (which increases income) had a smaller percentage growth in 
earnings when compared to late adopters. The implication for our study is that firms 
with smaller growth in earnings are more likely to overstate earnings. 

We measure the growth in earnings as earnings from continuing operations in the
year of the overstatement (corrected for the overstatement) minus earnings from
continuing operations reported in the previous year, scaled by total assets.
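
As a minimal sketch of this measure, assuming that the overstatement amount is known and is simply subtracted from reported earnings, the computation might look as follows. The function name, argument names, and figures are hypothetical and are not taken from the sample.

```python
def groearn(reported_earnings, overstatement, prior_earnings, total_assets):
    """Growth in earnings from continuing operations, scaled by total assets.

    reported_earnings: earnings from continuing operations as originally
        reported for the overstatement year.
    overstatement: amount by which those earnings were overstated.
    prior_earnings: earnings from continuing operations reported in the previous year.
    total_assets: scaling variable (assumed here to be measured in the overstatement year).
    """
    corrected = reported_earnings - overstatement
    return (corrected - prior_earnings) / total_assets


# Hypothetical firm: reported 12.0, of which 2.0 was an overstatement;
# prior-year earnings 9.0; total assets 100.0.
growth = groearn(12.0, 2.0, 9.0, 100.0)
```

In this illustration the corrected growth is 1 percent of total assets, whereas the uncorrected figures would have shown 3 percent.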


Debt and Incentives to Overstate Earnings 


Firms that violate debt covenants can be subjected to costly recontracting. In order 
to avoid these costs, firms that are close to violation of covenants have an incentive to 
overstate reported earnings. Further, firms not currently bound by covenants that wish 
to borrow in the future have incentives to maintain a history of covenant-related vari- 
ables that would be evaluated favorably by prospective creditors. Duke and Hunt (1990) 
found that several alternative leverage measures capture both the existence and tightness
of several common debt covenant restrictions.⁴ Using leverage as a proxy for
closeness to covenant constraints, Bowen et al. (1981) and Dhaliwal et al. (1982) found 
that higher leveraged firms are more likely to choose income-increasing accounting 
methods. In this study, we measure leverage as the ratio of total liabilities to total assets 


4 Use of actual debt covenant data is preferable. However, only three of our treatment firms issued public
debt and disclosure of their private debt contracting terms is limited. 
net of intangibles. The measure was truncated at one for five firms with negative stockholders’
equity.
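
The leverage measure can be sketched as below. The sketch assumes that "total assets net of intangibles" means total assets minus intangible assets and that negative stockholders' equity is identified by total liabilities exceeding total assets; both readings, and all names, are assumptions rather than details given in the text.

```python
def lev(total_liabilities, total_assets, intangibles):
    """Ratio of total liabilities to total assets net of intangibles.

    For firms with negative stockholders' equity (liabilities greater than
    total assets, by assumption) the measure is truncated at 1.
    The case of a zero denominator is not handled in this sketch.
    """
    if total_liabilities > total_assets:
        return 1.0
    return total_liabilities / (total_assets - intangibles)


# Hypothetical firms.
ordinary = lev(60.0, 110.0, 10.0)        # 60 / (110 - 10)
negative_equity = lev(120.0, 100.0, 5.0)  # truncated
```

Truncation keeps firms with negative equity in the sample without letting their extreme ratios dominate the leverage variable.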


GAAP Versus Overstatement Errors 


Both GAAP alternatives and overstatement errors can be used to enhance earnings. 
We anticipate that the relative benefit of overstatement errors depends, in part, on a 
firm’s ability to change to income-increasing GAAP procedures. One reason for choos- 
ing to overstate earnings in response to economic incentives is that the firm has rela- 
tively few income-increasing GAAP accounting alternatives available. To measure the 
relative ability of a firm to increase earnings via adopting alternative GAAP methods,
we use ratios developed from the classification scheme in Zmijewski and Hagerman 
(1981), where evidence is reported that managers follow an overall accounting choice
strategy. In particular, Zmijewski and Hagerman examined four accounting procedure 
choices: inventory, depreciation, the investment tax credit (ITC), and the amortization 
of past service pension costs. They assumed that FIFO, straight-line depreciation, flow- 
through treatment of ITC, and amortization periods of more than 30 years for past pen- 
sion costs are income-increasing methods. LIFO, accelerated depreciation, the deferral 
method for ITC, and amortization periods of 30 years or less for past pension costs are 
income-decreasing methods. The firms in our sample were assigned a score of 1 for 
each income-increasing method used, 0 for each income-decreasing method used, and 
0.5 if they used a combination of methods or used an accounting method not included 
in Zmijewski and Hagerman as either income-increasing or income-decreasing. If a 
particular procedure did not apply (e.g., the firm had no inventory), no score was assigned
for that procedure. These scores were then summed and divided by the maximum
score the firm would have received had it used every income-increasing procedure available to it. The resulting ratio,
ranging from 0 to 1, measures the relative extent to which the firm has already taken 
advantage of available income-increasing GAAP alternatives. 
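
The scoring rule just described can be sketched as follows. The labels, the dictionary layout, and the function name are hypothetical; the sketch assumes that the maximum score equals the number of procedures that apply to the firm, since each applicable procedure can contribute at most 1.

```python
# Score per procedure: 1 for an income-increasing method, 0 for an
# income-decreasing method, 0.5 for a combination or an unlisted method.
SCORES = {"increasing": 1.0, "decreasing": 0.0, "mixed_or_unlisted": 0.5}


def altgaap(choices):
    """Ratio of income-increasing alternatives used to those available.

    choices maps each of the four procedures to one of the SCORES keys,
    or to None when the procedure does not apply to the firm.
    """
    applicable = [c for c in choices.values() if c is not None]
    if not applicable:
        raise ValueError("no applicable accounting procedures")
    return sum(SCORES[c] for c in applicable) / len(applicable)


# Hypothetical firm: FIFO inventory, accelerated depreciation, a combination
# of ITC methods, and no pension plan.
firm = {
    "inventory": "increasing",
    "depreciation": "decreasing",
    "itc": "mixed_or_unlisted",
    "pension": None,
}
ratio = altgaap(firm)
```

For this hypothetical firm the ratio is (1 + 0 + 0.5) / 3 = 0.5, meaning that half of the available income-increasing alternatives are already in use.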


The Control Environment 


The probability that either intentional or unintentional errors are committed is 
reduced by controls that increase the probability of detection. Controls may result in 
detecting and correcting the unintentional errors before financial statements are 
released. With respect to intentional errors, the increased probability of detection 
increases the expected costs of misrepresentation since detected intentional errors 
reduce a manager’s value in the market for managerial labor. Thus, controls reduce the 
tendency to deliberately overstate earnings. 

Auditing is an important mechanism for controlling manager behavior, but the 
value of auditing as a control mechanism is unlikely to be the same for all audit firms. 
DeAngelo (1981) suggests that “larger audit firms supply a higher level of audit 
quality.” The rationale for this argument is that larger audit firms have a greater eco- 
nomic interest in assuring that financial statements are free from undetected errors. 
Thus, we expect that clients of Big Eight auditors are less likely to have overstatement 
errors. 

Audit committees are an important element of a firm’s control environment, and 
we conjecture that an audit committee reduces the likelihood of overstatement errors. 
Finally, larger firms might be expected to have stronger internal controls than smaller 
firms (Mautz et al. 1980). This assertion is consistent with the findings of Kreutzfeldt 
and Wallace (1986) that smaller firms tend to have a greater frequency of auditor-proposed
adjusting entries during their annual audits than do larger firms. Using firm size
as a proxy for the strength of a firm’s internal controls leads to the expectation that 
larger firms are less likely to have overstatement errors.⁵


II. Sample Selection and Approach to Data Analysis 
Sample Selection 


Our treatment group consists of firms that made corrections of earnings overstate- 
ment errors that existed in a prior year’s annual report. Our sample is chosen from the 
National Automated Accounting Research System (NAARS) data base and Accounting 
Trends and Techniques (ATT). The NAARS data base contains annual reports for ap- 
proximately seven years for approximately 4,100 public companies, and the search was 
performed in late September 1989. ATT was examined for the years 1977 (the year in 
which APB 20 became effective) through 1988. To identify potential treatment firms, 
we searched for firms with footnote disclosure of prior period adjustments. Several 
NAARS key-word strings were used because the wording of the relevant footnote disclosures
is not uniform.⁶

We were able to identify 35 firms from NAARS and six firms from ATT with adjustments
for previous overstatements of earnings.⁷ These firms are listed in the appendix.
A control group of 41 firms was randomly selected from NAARS.⁸ The control firms
were chosen such that they had a representation across years similar to the treatment 
firms. Subsequent financial statements filed by the control firms were reviewed to 
ensure that they did not contain a prior-period error correction. 

Nineteen of the 41 treatment firms had corrections affecting more than one prior 
period. Of the firms with multiple year restatements, we include in the analysis only the 
earliest year for which an overstatement error is known to have occurred. Of the firms 
in the treatment group, four also corrected understatement errors for one of the 
restated years, and two had cumulative adjustments for corrections to financial state- 
ments in periods prior to the year used for our tests without disclosing the accounts af- 
fected. Because we did not have details of the accounts affected for these two firms, we 
could not use the earliest year in which an overstatement occurred. 

The treatment firms appeared to be evenly distributed across the seven broad in- 
dustry classes used by the SEC with the exception of the transportation, communica- 
tion, electric, and sanitary industry class. While only two treatment firms are in this 
category, there are seven control firms. 

Corrections of errors were found for years from 1976 through 1987. The time 
elapsed until the corrections were made ranged from less than one year (for two firms


5 A problem with using firm size as a proxy for internal control strength is that several accounting choice
studies have used size as a proxy for political costs. Possibly, larger firms are scrutinized more closely and 
suffer greater penalties if they are caught “cheating.” If this is the case, then they are less likely to overstate 
earnings. 


6 The exact word strings used are available from the authors upon request.

7 While we are only able to identify 41 firms with overstatement errors, conceivably, there are many firms 
with overstatement errors that are not detected and reported. Generalizing from our sample data to this specu- 
lated set of firms is left to the reader. 

8 The size of the control group was restricted due to the cost of hand collecting data from annual reports,
10Ks, and proxy statements for each firm. 
Table 1
Description of Variables

OVERSTATE = coded 1 if firm overstated earnings, 0 otherwise.
DIFFUSE = coded 1 if no one owner controls five percent or more of the firm’s common stock, 0 otherwise.
LEV = ratio of total liabilities to total assets net of intangibles. For five firms with negative stockholders’ equity, the measure was truncated at 1.
GROEARN = current year’s earnings from continuing operations minus prior year’s earnings from continuing operations, scaled by total assets.
ALTGAAP = ratio of income-increasing GAAP alternative procedures used to the total available for use by the firm.
AUDBIG8 = coded 1 if sample firm is audited by a Big Eight auditor, 0 otherwise.
AUDCOM = coded 1 if sample firm had an audit committee, 0 otherwise.
ln(ASSETS) = natural logarithm of the book value of total assets.

The variables are measured in the year of the earnings overstatement. All of the financial variables are restated for correction of the overstatements.


that filed amended statements prior to filing the subsequent year’s annual report and 
did not mention the correction in the subsequent year’s report) to four years. A descrip- 
tion of each of the overstatements is presented in the appendix. 


Approach to Data Analysis 


The variables used in the analysis and discussed earlier are defined in table 1. We 
use approximate randomization tests as our primary approach to comparing the treat- 
ment and control firms. This approach proceeds as follows. First, logistic regression is 
used to estimate the coefficients in the following model: 


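\[
\begin{aligned}
\text{OVERSTATE} = {} & b_0 + b_1\,\text{DIFFUSE} + b_2\,\text{LEV} + b_3\,\text{GROEARN} + b_4\,\text{ALTGAAP} \\
& + b_5\,\text{AUDBIG8} + b_6\,\text{AUDCOM} + b_7 \ln(\text{ASSETS}) + e.
\end{aligned}
\tag{1}
\]

Predicted signs: b_1 (+), b_2 (+), b_3 (-), b_4 (+), b_5 (-), b_6 (-), b_7 (-).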


Second, the dependent variable observations are randomly reordered (shuffled), and 
the logistic regression model is reestimated a large number of times (999 in this study). 
For each independent variable, this yields a distribution of coefficients under the null 
hypothesis of no association with the dependent variable. The 999 observations of each 
coefficient are compared to the estimate of the coefficient obtained from the logistic 
regression model estimated without random reordering. The significance level of a coefficient
is determined to be:


\[
\frac{NGE + 1}{NS + 1}, \tag{2}
\]


where NGE is the number of coefficients greater than or equal to the nonrandom estimate
of the coefficient, and NS is the number of random shuffles of the dependent variable
observations. If the hypothesis predicts a negative sign, the number of coefficients
less than or equal to the nonrandom estimate replaces NGE in equation (2).

The approximate randomization approach is useful in this setting because it has
less restrictive assumptions than do conventional classificatory techniques such as 
logistic regression. Further, it is appropriate even when the sample is not random 
(Noreen 1989, 84), in which case the question being addressed is: For the treatment and 
control firms in the sample, are all permutations of the dependent variable relative to the 
independent variables equally likely? 
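The shuffle-and-recount logic behind equation (2) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: to stay dependency-free it uses a difference in group means as the test statistic rather than a logistic regression coefficient, and the function names, the seed, and the toy data are all assumptions made for the example.

```python
import random


def randomization_p_value(y, x, statistic, n_shuffles=999, predicted_sign=1, seed=0):
    """Approximate randomization significance level, (NGE + 1) / (NS + 1).

    y              -- 0/1 dependent variable (1 = treatment firm)
    x              -- one explanatory variable, same length as y
    statistic      -- function (y, x) -> float; a stand-in for a fitted coefficient
    predicted_sign -- +1 counts shuffled values >= observed, -1 counts values <= observed
    """
    rng = random.Random(seed)
    observed = statistic(y, x)
    shuffled = list(y)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)              # reorder the dependent variable only
        value = statistic(shuffled, x)
        if predicted_sign > 0 and value >= observed:
            count += 1
        elif predicted_sign < 0 and value <= observed:
            count += 1
    return (count + 1) / (n_shuffles + 1)


def mean_difference(y, x):
    """Mean of x for y == 1 minus mean of x for y == 0 (illustrative statistic)."""
    ones = [xi for yi, xi in zip(y, x) if yi == 1]
    zeros = [xi for yi, xi in zip(y, x) if yi == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


# Hypothetical data: ten "treatment" and ten "control" observations.
y_demo = [1] * 10 + [0] * 10
x_demo = list(range(10, 20)) + list(range(0, 10))
p_demo = randomization_p_value(y_demo, x_demo, mean_difference)
```

Passing a function that refits the logistic model on the shuffled dependent variable and returns a single coefficient as `statistic` would correspond more closely to the procedure described above. With 999 shuffles the smallest attainable significance level is 1/1000.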


III. Results 
Descriptive Information on Errors 


The footnote disclosures explaining the overstatements tended to be brief and unin- 
formative. In 30 cases, no explanation was given at all; in seven cases the overstatement 
was attributed to some type of calculation error; and in four cases the footnote implied 
that the overstatements were intentional. The references to intentional overstatements 
were usually indirect, using terms such as “financial irregularities,” “unauthorized
transactions,” and “fictitious entries.” In 32 of the cases, there was no mention of the 
circumstances that led to discovery of the misstatements. Four cases mentioned in- 
house investigations, four mentioned SEC initiated corrections, and one indicated the 
overstatement was discovered when a customer’s refusal of an inventory shipment re- 
vealed a significant amount of defective inventory. Examination of the management 
discussion section of the 10-K in the year of the restatement did not reveal any signifi- 
cant information beyond that contained in the footnotes. Most discussions merely ref- 
erenced the appropriate footnote. 

A search of the LEXIS data base disclosed that an SEC release was issued pertaining
to seven of the treatment firms.9 The releases varied in their elaboration of the
causes of the misstatements, but all asserted at least some degree of fraud. The “penalties”
prescribed in the releases generally consisted of consent decrees enjoining the
officers of the company from future violations of the Securities Acts, without the 
officers admitting or denying guilt. In one case, the company was also ordered to form 
an audit committee, and in another case to have an independent auditor review the system
of internal control. A search of the Dow Jones News/Retrieval Service (DJNRS) revealed
an additional instance where fraud was alleged. In total, using footnote disclosures,
a search of the LEXIS data base and a search of the DJNRS, we identified ten
treatment firms with public disclosures attributing overstatements to management 
fraud. 

For four firms, there is some indication that the errors were not deliberate attempts
to increase earnings with the knowledge that GAAP was being violated. In the case of 
Bindley Western, a news release (Dow Jones News Wire, 16 March 1987) indicated that 
the independent accountants concluded that the errors were inadvertent. Kinder-Care 
indicated that they only agreed to restate financial results after the SEC staff disputed 
the advice of their independent auditor (Peat Marwick Mitchell & Co.) as to the appro- 
priate way to account for conversion of convertible subordinated debentures into com- 
mon stock (The Wall Street Journal, 27 August 1984). Telesphere noted that while their 
current auditors, Touche Ross & Co., approved of the restatement, their prior auditors, 
Peat Marwick Main & Co., disagreed with the restatement (Dow Jones News Wire, 22


9 Two of these firms also had footnote disclosures indicating overstatements were intentional.

Table 2
Magnitude of Overstatements
(n = 41)

                                        Mean     Standard Deviation   Maximum   Median   Minimum
Overstatement (in $000s)                $3,223   $7,611               $44,000   $701     $30
Overstatement as a percent of assets    1.6%     3.0%                 13.4%     0.6%     0.1%


March 1988). In the case of Zitel, management claimed that a restatement was needed 
only because their auditors, Coopers & Lybrand, reinterpreted an accounting principle 
applicable to moving a facility (Dow Jones News Wire, 10 August 1987). In a prior 
period, Zitel leased a new facility prior to the conclusion of their existing lease. Accord- 
ing to management, the original interpretation of the applicable accounting principle 
was that the two leases could be aggregated for accounting purposes. The reinterpreta- 
tion was that the transactions must be accounted for separately. The effect was to 
charge the remaining liability of the original lease to the last fiscal year, reducing re- 
ported income. 

In contrast, some news articles called into question the motivation of managers. 
For example, while the managers of Eastern Gas refused to discuss motives related to
the errors, analysts speculated that managers may have been trying to overstate earnings to
increase their bonuses (The Wall Street Journal, 2 March 1984). They noted that coal
companies (the line of business of the subsidiary involved with the error) frequently 
offer substantial bonuses based on cost reduction. In the case of McCormick and Co., it 
was noted that the errors occurred during a period when the grocery-products division 
(the division involved with the error) was experiencing increasing pressure to turn a
profit (The Wall Street Journal, 1 June 1982). In the case of Stauffer Chemical Co., a sig- 
nificant part of the restatement was related to a plan to accelerate sales to dealers. This 
plan followed the realization that agricultural chemical sales would fall off sharply dur- 
ing the year of the error. The SEC charged that the plan was equivalent to consignment 
sales, resulting in an overstatement of profit. As quoted in The Wall Street Journal (14
August 1984), a government official noted that: “Their business was down and they 
wanted to accelerate sales.” 


Magnitude of Errors


The descriptive statistics, reported in table 2, show that overstatement errors have a 
mean of $3,223,000 with a range from $30,000 to $44,000,000 and a median of $701,000. 
The distribution is highly skewed to the right. The relative materiality of the errors can 
be measured by scaling the errors by total assets.10 When scaled by total assets (after
restatement), the errors are also skewed to the right and have a range of 0.1 percent to 
13.4 percent and a median and mean of 0.6 percent and 1.9 percent, respectively. 


10 Scaling by earnings is less informative because ten (13) of the treatment firms reported losses before
(after) correcting for the overstatements.

Table 3
Correlation of Overstatements and Growth in Earnings
(n = 41)

                                                       Pearson Correlation
                                                       Coefficient             p
Overstatement and growth in earnings*                  -0.764                  0.00
Overstatement and growth in earnings (before
  extraordinary items and discontinued operations)     -0.781                  0.00

* The growth in earnings is measured without the effect of the overstatement.


Relationship Between Errors and Earnings 


Table 3 reports the correlation between the earnings overstatements and the
growth in earnings. The growth in earnings is computed as the difference between as-if
(corrected) earnings in the overstatement year and earnings of the preceding year. The
correlations of -0.764 and -0.781, measured after and before extraordinary items and
discontinued operations, respectively, are significantly different from zero at p<0.001,
indicating that the greater the decline in earnings, the greater the magnitude of the overstatement.
While inconclusive, these significant correlations are consistent with managers attempting
to mask poor earnings performance. An alternative explanation for this
correlation is that randomly generated overstatement errors are more difficult to detect
when earnings are declining. This could occur if the reasonableness tests applied
during the audit process are geared toward detecting deviations from the prior
year.
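As a hedged illustration of the statistic reported in table 3, the sketch below computes a Pearson correlation between growth in earnings (as-if corrected earnings minus prior-year earnings) and the size of the overstatement. The five data points are invented for the example and are not taken from the sample; the function name is likewise an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical firms: growth in earnings and overstatement, both scaled by total assets.
growth = [-0.20, -0.10, -0.05, 0.00, 0.05]
overstatement = [0.12, 0.07, 0.03, 0.02, 0.01]
r = pearson_r(growth, overstatement)   # negative: larger declines, larger overstatements
```

A negative value of `r` corresponds to the pattern described above, in which larger earnings declines accompany larger overstatements.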


Comparison of Treatment and Control Firms 


Table 4 presents descriptive statistics for the treatment and control firms. Continu- 
ous variables are tested using Wilcoxon two-sample tests, and categorical variables are 
tested using Fisher’s exact tests (Siegel and Castellan 1988). All differences in means 
and medians between the treatment and control groups are in the expected direction. 
Differences with respect to GROEARN and ALTGAAP are significant at p<0.01. Differences
with respect to AUDBIG8, AUDCOM, and ln(ASSETS) are significant at p<0.05.
The differences with respect to DIFFUSE and LEV are not significant at the 0.10 level. 
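The one-tailed Fisher's exact probability for a categorical variable can be sketched from a 2x2 table of counts. The example below is illustrative only: the counts for AUDBIG8 (30 of 41 treatment firms and 38 of 41 control firms with a Big Eight auditor) are inferred by multiplying the means reported in table 4 (0.732 and 0.927) by n = 41 and are not stated in the paper, and the function name is hypothetical.

```python
from math import comb


def fisher_one_tailed_lower(a, b, c, d):
    """One-tailed Fisher's exact probability that the top-left cell is <= a.

    The 2x2 table is [[a, b], [c, d]] with all margins held fixed, so the
    top-left cell follows a hypergeometric distribution under the null.
    """
    row1 = a + b                      # size of the first group (treatment firms)
    col1 = a + c                      # firms with the attribute (Big Eight auditor)
    n = a + b + c + d
    lowest = max(0, row1 - (n - col1))  # smallest feasible top-left count
    favorable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(lowest, a + 1)
    )
    return favorable / comb(n, row1)


# Inferred counts: treatment 30 Big Eight / 11 other; control 38 Big Eight / 3 other.
p_audbig8 = fisher_one_tailed_lower(30, 11, 38, 3)
```

Under these inferred counts the result is close to the one-tailed probability of 0.019 reported for AUDBIG8, which suggests the inferred cell counts are reasonable.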

Approximate randomization test results are presented in table 5. For purposes of 
comparison, conventional tests of logistic regression coefficients are presented. The 
approximate randomization tests indicate support for four of the seven variables at 
p<0.05. The variables DIFFUSE, GROEARN, ALTGAAP, and AUDCOM are significant 
at conventional levels. LEV, AUDBIG8, and ln(ASSETS) are not significant. All coefficients
have the expected signs. Tests conducted using conventional logistic regression
yield significance levels that are very close to those obtained using the approximate
randomization procedure. 

11 The Wilcoxon two-sample test should not be confused with the more common Wilcoxon test applied to
paired observations (which is obviously inappropriate for the data presented here). See Hollander and Wolfe
(1973) for a discussion of the assumptions underlying both tests.

Table 4
Descriptive Statistics and Univariate Tests

                              Descriptive Statistics
                              (A) Treatment   (B) Control
                              (n=41)          (n=41)
                              Mean            Mean
                              Median          Median          One-Tailed
Expectation   Variable        (Std. Dev.)     (Std. Dev.)     Probability*

A>B           DIFFUSE         0.195           0.146           0.385
                              0               0
                              (0.401)         (0.358)

A>B           LEV             0.647           0.617           0.219
                              0.896           0.821
                              (0.234)         (0.234)

A<B           GROEARN         -0.039          0.024           0.001
                              -0.003          0.009
                              (0.121)         (0.088)

A>B           ALTGAAP         0.820           0.680           0.008
                              0.633           0.867
                              (0.199)         (0.280)

A<B           AUDBIG8         0.732           0.927           0.019
                              1.000           1.000
                              (0.449)         (0.264)

A<B           AUDCOM          0.756           0.927           0.033
                              1.000           1.000
                              (0.435)         (0.264)

A<B           ln(ASSETS)      18.37           19.39           0.014
                              18.18           18.94
                              (1.79)          (2.11)

* Wilcoxon two-sample tests for continuous variables, Fisher’s exact tests for categorical variables.





Table 5
Approximate Randomization Tests
(n = 82)

Explanatory   Predicted                                One-Tailed
Variables     Sign        Coefficient   t-statistic    Probability*   P(Ran)†

Intercept     n/a           3.374         0.878        n/a            n/a
DIFFUSE       +             1.456         1.8968       0.029          0.020
LEV           +             1.288         0.856        0.176          0.165
GROEARN       -           -17.222        -2.429        0.008          0.001
ALTGAAP       +             2.078         1.575        0.058          0.047
AUDBIG8       -            -0.719        -0.847        0.1989         0.154
AUDCOM        -            -1.851        -2.114        0.017          0.007
ln(ASSETS)    -            -0.200        -1.026        0.153          0.144

* Assessed using conventional logistic regression approach.
† P(Ran) = probability assessed using an approximate randomization procedure.
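To show how the coefficients in table 5 combine, the sketch below plugs them into the logistic function for a hypothetical firm. The firm's characteristics are invented for illustration, the function name is an assumption, and because the sample pairs 41 treatment firms with 41 control firms, the resulting value is best read as a relative in-sample score rather than a population probability of overstatement.

```python
from math import exp

# Coefficient estimates as printed in table 5.
COEFFICIENTS = {
    "Intercept": 3.374,
    "DIFFUSE": 1.456,
    "LEV": 1.288,
    "GROEARN": -17.222,
    "ALTGAAP": 2.078,
    "AUDBIG8": -0.719,
    "AUDCOM": -1.851,
    "ln(ASSETS)": -0.200,
}


def overstatement_score(firm):
    """Logistic transform of the linear index in equation (1) for one firm."""
    z = COEFFICIENTS["Intercept"] + sum(
        COEFFICIENTS[name] * value for name, value in firm.items()
    )
    return 1.0 / (1.0 + exp(-z))


# Hypothetical firm without an audit committee, then the same firm with one.
firm_without = {
    "DIFFUSE": 1, "LEV": 0.6, "GROEARN": -0.04, "ALTGAAP": 0.8,
    "AUDBIG8": 1, "AUDCOM": 0, "ln(ASSETS)": 18.0,
}
firm_with = dict(firm_without, AUDCOM=1)

score_without = overstatement_score(firm_without)
score_with = overstatement_score(firm_with)   # lower, since the AUDCOM coefficient is negative
```

Holding the other variables fixed, switching AUDCOM from 0 to 1 lowers the score, which matches the negative sign estimated for that variable.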
Reduced Sample Tests


As indicated above, ten of the overstatement errors were associated with some 
allegation of fraud. To explore the effects of these observations, the ten “fraud” firms 
and ten randomly selected control firms were excluded and approximate randomization
tests were performed on the reduced sample. The results for the reduced sample
(not reported here) were quite consistent with those for the full sample, except for AUDBIG8,
which became significant at the 0.06 level in the analysis of the reduced sample.


IV. Summary 


We present evidence in this paper that public companies that had overstated earn- 
ings in annual reports differ systematically from others. Specifically, firms that over- 
state earnings tend to have diffuse ownership, have lower growth in earnings and 
relatively fewer income-increasing GAAP alternatives available, and are less likely to 
have audit committees. In addition, the overstatements are negatively correlated with 
the growth in earnings. 


Appendix 


Description of Overstatement Errors

Firm                              Error

Acton Corp.                       Equipment cost not depreciated or charged against sale of equipment.
Agway Inc.                        Misapplied city tax rate in computing effective tax rate.
Arnold Industries                 Incorrectly accounted for investment in limited partnerships.
Atlantic American                 Overstated deferred acquisition costs.
Bindley Western Industries        Errors in reporting interest income and investment transactions.
Bouton Corp.                      Inventory overstated due to compilation error.
Cache, Inc.                       Inventory overstated.
CCA Industries                    Errors in computing accruals.
Comserv                           Revenue overstated due to undisclosed customer agreements and cutoff.
Eastern Gas and Fuel              Inventory overstated and liabilities understated.
Eastmet Corp.                     Errors in computing pension termination gain and bankruptcy costs.
Engineering Measurements          Inventory computation error.
Fleming Companies                 Accounts receivable, notes receivable, and inventory overstated.
Founder's Financial               Actuarial calculation errors.
Inter-Regional Financial          Receivables overstated.
JWT Group                         Revenue overstated and costs understated.
Keystone Camera                   Error in computing loss from discontinued operations.
Knogo Corp.                       Errors in recording leasing revenues.
Kinder-Care Learning              Convertible debt transaction charged to paid-in capital instead of income.
Magic Circle Energy               Error in computing depreciation and amortization on gas production.
Magnetics International           Inventory overstated in connection with a merger.
McCormick & Co.                   Advertising expense understated.
Mizel Petro Resources             Error in computation of deferred tax.
Murphy Oil                        Reinsurance transactions should have been treated as financing arrangements.
NCA Corp.                         Accounts receivable errors due to inconsistencies in timing of revenue recognition.
New Brunswick Scientific          Inventory overstated.
North East Insurance              Errors in accounting for reinsurance contracts.
Oak Industries                    Errors in accounts receivable, liabilities, and investments.
Ocean Drilling and Exploration    Reinsurance transactions should have been treated as financing arrangements.
Ohio Ferro-Alloys                 Incorrect calculation of gain on pension termination.
Pepsico                           Inventories overstated, liabilities understated.
Plasma-Therm                      Interest expense and purchases not recorded.
Poloron Products                  Inventory and sales tax errors and unrecorded customer rebates.
Portec                            Inventory overstated and sales cut-off errors.
Seaport Corp.                     Inventory calculation errors.
Sovereign                         Error in calculating accounts receivable and income taxes.
Stauffer Chemical                 Inventory and sales cut-off errors.
Telesphere International          Mechanical errors in computing accruals.
Tipperary Corp.                   Error in tax benefit of discontinued operations.
Zitel Corp.                       Failed to record abandoned lease liability.
Zondervan Corp.                   Inventory, sales cut-off, and accounts payable errors.
Shane R. Moriarity, Editor


Editor’s Note: Two copies of books for review should be sent to Professor
Shane Moriarity, School of Accounting, University of Oklahoma, Norman, 
OK 73019. The policy of the Review is to publish only those reviews solic- 
ited by the Book Review Editor. Unsolicited reviews will not be accepted. 


J. E. BORITZ, Approaches to Dealing with Risk and Uncertainty (Toronto: The Canadian
Institute of Chartered Accountants, 1990, pp. xxiii, 132, CAD $19.50).


This book is a Research Report commissioned by the Canadian Institute of Chartered Ac- 
countants (CICA) to address several recommendations made by the Macdonald Commission 
regarding the disclosure of risk and uncertainty in financial statements. The foreword to the 
work makes a disclaimer of sorts stating that research reports are not primarily “intended to
stimulate thought, discussion and debate on matters of accounting and auditing theory and 
practice.” Nonetheless, this reviewer found that the book does exactly that. 

The book considers risk due to uncertainty about four major areas in financial reporting: 
Nature and Role of Financial Statements, Nature of Business Operations Portrayed in the Finan- 
cial Statements, Limitations of Financial Statement Measurements and Disclosures, and Man- 
agement’s Motives and Intentions. The book covers this broad range of topics, each with an 
extensive academic and practitioner literature of its own, in only 132 pages by sticking to the 
main points and using a pungent and direct style. For example, when discussing the disclosure 
of contingent liabilities resulting from litigation the author observes: 


One of the key issues related to the disclosure of litigation is the fear that acknowledging it,
especially estimating expected losses due to litigation, may disclose an enterprise’s litigation or
settlement strategy to current or would be GEES thereby compromising that very strategy. This is a
thorny problem (p. 84).


The author has obviously taken care to make the book readable to those unfamiliar with the 
literature in this area. He has been successful in the effort. The book includes a very useful 
Glossary of Terms. All of the mathematical notation is kept out of the text and confined to the 
Glossary or footnotes. The book is very well organized and structured and includes an Executive 
Summary. Unfortunately, it lacks an index. 

One of the dangers of covering such a vast and complex subject in a short book is over- 
simplification of the issues. When reading the book there were times when I was left wanting a 
more detailed discussion, but this problem is mitigated by the extensive footnotes and sugges- 
tions for further reading. 

While on the whole this work sticks to business, it does wax philosophical at times and not
just in the quotations at the start of each chapter. I found particularly enjoyable the rather jocu- 
lar consideration of whether Heisenberg’s Uncertainty Principle applies to accounting in addi- 
tion to quantum mechanics. 

The book accomplishes its stated primary objective of making recommendations to the 
CICA in response to particular Macdonald Commission recommendations. While one might not 
always agree with each of the 15 recommendations, one cannot fault the author for not making 
his arguments known clearly and unambiguously. This is required reading for anyone involved 
in shaping the future of Canadian financial reporting standards. The book has many insights
relevant to standards in the United States and internationally as well. 
Boritz writes with practitioners in mind. He performs a service by summarizing the results
of a large body of academic research in a manner that shows that it has important implications 
for standards setting. The clear style of the book also lends itself to classroom use. Selections 
from this book would be valuable in an accounting theory course. Researchers might also find 
this book enlightening because it shows their work being used in a policy making situation and 
because it calls for further research in many areas. 

JOHN P. WENDELL 
Assistant Professor of Accounting, 
University of Hawaii 


T. E. COOKE, An Empirical Study of Financial Disclosures by Swedish Companies 
(New York: Garland Publishing, Inc., 1989, pp. xii, 381, $85.00). 


This volume would appear to be for those who are interested in disclosure issues. It is, but it
is also much more than that. The stated objective of Cooke’s research is to assess variability in 
disclosure in Swedish corporate reports; however, in setting the scene for his empirical work 
the author devotes more than one-third of the text to an extensive description and analysis of the 
factors influencing accounting and financial reporting in Sweden. Anyone interested in the relationships
between accounting and its environment will want to read this.

Cooke’s analysis of the Swedish reporting environment is structured around a modification 
of Radebaugh’s conceptual framework of the major factors influencing the development of ac- 
counting. Cooke’s coverage is exhaustive and illuminating. He paints a fascinating picture of a 
surprisingly contradictory reporting environment. On the one hand Sweden is one of the most 
multinationalized countries in the OECD area; on the other its corporations maintain a distinc- 
tion between “free” and “restricted” shares which effectively prevents a hostile takeover by a 
foreigner. In the last decade institutional dominance of the Stockholm Exchange has increased 
dramatically (the percentage of total market value owned by private individuals declined from 50 
percent to 23 percent); yet, the number of Swedes owning shares has risen sharply and the per- 
centage of the population owning some kind of shares is the highest in the world. The nation 
has substantial accounting-related legislation and a number of standard setting bodies (the 
Accounting Standards Committee, the National Accounting Standards Board, the Industry 
Committee on Stock Exchange Matters, and the Federation of Financial Analysts); yet, the 
manner in which standards are set is somewhat mysterious. “One view is that if an accounting 
practice is adopted by a major Swedish company that procedure becomes part of general prin- 
ciples. There is considerable validity to this argument” (p. 156). 

Against this rich background Cooke undertakes a content analysis of 90 Swedish corporate 
reports. These include unlisted companies, corporations listed only in Sweden, and entities 
listed in more than one nation. The aggregate disclosure index is based on an evaluation of 224 
variables. The inclusion of unlisted companies is highly unusual, and the set of variables 
examined is far broader than that used in most disclosure studies. Further, Cooke not only eval- 
uates aggregate disclosures, but he also investigates voluntary disclosures and social responsi- 
bility disclosures. The independent variables examined in relation to all categories of disclo- 
sures include: quotation status, size, parent company relationship, and industry group. 

Cooke concludes that quotation status is the most important independent variable in ex- 
plaining the variability in disclosure indexes. Listed companies consistently disclose more infor- 
mation than unlisted companies; companies with multiple quotations disclose significantly 
more information than those listed solely in Sweden. Size is an important factor, but whether 
the measure is in terms of total assets, annual sales, or number of shareholders does not matter. 
The independent variables important in explaining variability in the aggregate disclosure 
indexes are also important in explaining the variability in the voluntary disclosure indexes and 
social responsibility indexes. 

In his data analyses the author supplies numerous examples from actual corporate reports 
and a wealth of details on the specific features of financial reporting. The analyses are further 
enriched by constant comparisons of Swedish practices with those in the U.K. and the U.S. 
The volume does suffer from a few flaws. There is some unnecessary repetition. Several
items cited in the text are missing from the bibliography, and the page layouts could be greatly 
improved in many cases. Nonetheless, this work deserves a wide audience, which should 
include both those interested in disclosure issues and those intrigued by accounting and its 
culture(s).

JUDITH RAMAGLIA 
Associate Professor of Accounting 
Pacific Lutheran University 


PHILLIP G. COTTELL and TERRY M. PERLIN, Accounting Ethics: A Practical Guide 
for Professionals (Westport, CT: Quorum Books, 1990, pp. xii, 171, $39.95). 


This book presents a refreshing approach to ethics and professionalism for the accounting 
student. Instead of focusing primarily on the professional code of conduct, the authors focus on 
an ethical systems approach to ethics in accounting utilizing situations and ethical dilemmas 
often faced by accountants. The authors, one an accounting professor and one a professor of 
interdisciplinary studies, combine to produce a book which does an excellent job of integrating 
ethical issues in accounting into an ethical system framework for decision making. 

The dominant ethical systems of utilitarianism, deontologism, and ethical realism are intro- 
duced at the beginning of the book in a concise, understandable presentation. A perspective on 
the focus of each system is given with comments and criticisms. These systems provide students 
with a framework for analyzing ethical situations and are utilized by the authors in the discus- 
sion of accounting issues in all chapters. 

Moral dilemmas facing accountants are introduced in Chapter Two with a broad discussion 
of professionalism and codes of ethics. The meaning of professionalism, the AICPA Code of 
Conduct, and the Codes of Ethics of the NAA and Institute of Internal Auditors are presented to 
set the background for conflicts resulting from professional obligations and responsibilities. 
Moral dilemmas resulting from code obligations such as confidentiality versus the public inter- 
est are discussed using the ethical systems framework to illustrate inherent conflicts in duties. 

The remaining eight chapters expand on specific issues facing accountants. Topics include: 
independence, whistle-blowing, mentoring, illegal actions, harm to society, professional rela- 
tionships, the accounting community, male/female relationships, and social responsibility ac- 
counting. The professional responsibilities under the Code of Conduct are defined in all appli- 
cable cases, but the discussions go far beyond the code. Topics such as mentoring, professional 
relationships, the accounting community, and gender issues should provide students with practical
insights into the personal and professional relationships and potential moral situations they
will encounter in their professional careers. These issues are a very important part of the busi- 
ness world and are seldom discussed in accounting courses. The inclusion of these topics is 
extremely important in preparing students for careers in the accounting profession. 

Discussions of whistle-blowing, illegal actions, and potential harm to society of actions give 
the students a broad perspective on issues involving right versus wrong and on the resulting 
personal consequences. Exposure to these topics should help students cope with related issues 
if they arise in their future careers. The ethical systems approach of analysis used provides a 
basis for decision making the students will be able to transfer to future ethical dilemmas. 

The book concludes with a social responsibility accounting section giving a broad overview 
of responsibilities of business in society and of the accounting profession to report on business 
actions. Implications for the expanded role of professional accountants in the measurement and 
reporting of business actions are discussed using both utilitarian and deontologic approaches.

The discussions in each chapter are centered around real-world ethical examples and ethics 
cases are included at the end of each chapter for student analysis. These cases are well done 
and present realistic ethical issues and moral situations faced by accountants. Analysis of the 
cases is suggested using the ethical systems framework developed early in the book.

This book provides a good approach for incorporating ethics into accounting courses and 
would be appropriate as a supplemental text for advanced level accounting courses or for use in
a separate accounting ethics and professionalism course. The issues raised in the book and the
ethical systems approach could be integrated throughout the accounting curriculum to give stu- 
dents a broader perspective on the ethical issues facing accountants. 

JOANNE W. ROCKNESS 

Associate Professor of Accounting 
North Carolina State University


JOHN DONALDSON, Key Issues in Business Ethics (London: Academic Press, 1989, 
pp. xxiv, 227, $19.95, paper). 


Many accountants consider ethics like the weather, “everyone talks about it, but no one 
does anything about it.” Some call “business ethics” an oxymoron. Others stress that “Good
ethics like good clothes must be used frequently.” In these days of business fraud and audit failure,
no subject is more timely than ethics. Donaldson’s book reviews many ethical theories,
points out their similarities and inconsistencies, develops a general framework for discussing 
ethical issues, and applies the results to resolving several ethical dilemmas. 

The book first traces the history of ethics, focusing on the works of such thinkers as Plato, 
Hobbes, Locke, Kant, Rousseau, Hume, Mill, Marx, Mayo, Taylor, Weber, Argyris, and Gal- 
braith. Donaldson analyzes and applies ethics to the British and American economic systems. 
Given his complex writing style and terse summaries of many difficult issues, the reader needs a 
fairly strong liberal arts background to appreciate Donaldson’s arguments. 

Published in Great Britain, the book gives adequate coverage to the American economic en- 
vironment. Intended as a text in a business ethics course, the author competently compares and 
contrasts rival ethical theories. He focuses more on the process of identifying and evaluating 
these theories than on developing their content. While he integrates these theories very well, the 
reader must have some familiarity with the ideas to appreciate the stated nuances. Donaldson 
stresses the dangers of “convenient” capricious actions, arguing that students need to develop
strong ethical frameworks, even though they may occasionally violate their own principles. 

The author then helps the reader apply these great works in order to develop a personal ethical
framework. He argues that all frameworks should be based on the Golden Rule (i.e., behaving
unto others as though you were the other), but allows for the existence of other moral frameworks.
Curiously, Donaldson traces the Golden Rule to various philosophers and to the Sermon
on the Mount without mentioning Rabbi Hillel in 100 B.C. and the prophet Mohammed. 

The author deftly uses ethical theories to analyze business dilemmas, but he does not solve 
them. His case analyses and graphical taxonomies present novel ways to understand complex 
problems. While this discussion relates primarily to business and economics, its application to 
accounting can easily be implied. 

The author is to be commended for clearly separating his personal viewpoints from his cited 
ones, for “leading” the reader to the next section of the arguments presented, and for develop- 
ing a fine glossary. On the other hand, his writing style seems complex and the number of tangi- 
ble applications, especially in the accounting arena, seems sparse. 

In summary, this book seems well suited for liberal arts majors wishing to obtain an under- 
standing of business ethics. However, the book is not suited to the usual structure by which 
ethics is taught at most business schools. It is some consolation to realize that with the impetus 
of the Bedford Committee, the Treadway Commission, the American Assembly of Collegiate 
Schools of Business, and the Big Eight White Paper, we may soon get accounting majors with
the backgrounds necessary to appreciate this book and accounting faculty prepared to teach it. 

ALAN REINSTEIN 
Professor and Chair 
Department of Accounting 
Wayne State University 


M. J. R. GAFFIKIN, Accounting Methodology and the Work of R. J. Chambers (New 
York: Garland Publishing, Inc., 1989, pp. x, 236, $50.00). 
This book offers a detailed and comprehensive analysis of Chambers’ works from a philosophy
of science/social science perspective. Gaffikin focuses the treatise on a methodological
analysis, rather than the actual content of Chambers’ writings. The manuscript appears aimed at 
providing recognition for Chambers’ contribution to the accounting literature, despite the lack 
of acceptance for his Continuously Contemporary Accounting (CoCoA) theory. 

The text begins with an overview of the notion of methodology from a general philosophy of
science perspective. This is followed by a review of the evolution of the methodology concept
and its application in the development of economic theory. The author carefully delineates dif- 
ferences in methodological approaches in the physical and social sciences. The author also does 
a good job in providing a careful and precise historical analysis of the state of the art of the role 
of methodology in economic theory development, although some might criticize the selectivity 
of material reviewed. Unfortunately, many readers will find this material too laborious, preventing
them from arriving at the more central and more interesting theme, the analysis of Chambers’
work, which is presented in later chapters.

The later chapters analyze Chambers’ CoCoA within this framework. If there is a flaw in
Gaffikin’s analysis, it comes at this stage. Readers are asked to accept that “. . . there can never
be a methodology that can be held up as being more scientific, more [illegible] than others and,
therefore, one to which disciplines such as economics should aspire. If this is significant for
economics, then it is equally significant for accounting. Accountants have drawn heavily from 
economics” (p. 68). Even though the analysis up to this point tells us that the methodology of 
economics cannot be evaluated purely from a philosophy of science perspective, readers are 
expected to simply accept that accounting theory should be analyzed from the same view as 
economics. A more convincing argument on the similarities between the two disciplines 
appears necessary. 

Gaffikin does a convincing job of painting a portrait of Chambers as an empiricist, although
many modern researchers who view empiricism as working with large computerized databases
might take exception to this characterization. Readers who overcome the semantic bias on
empiricism will find a thought-provoking description of Chambers’ work. Gaffikin’s analysis of the
evolution of CoCoA from Chambers’ voluminous books, lectures, and other writings is enlightening
and is recommended reading for contemporary monists among accounting researchers.

This work will not appeal to a large number of readers of this journal. However, it is a
volume worthy of inclusion in university libraries, and would make interesting reading in
a seminar offering a philosophy of science approach to accounting theory construction and
analysis.

JOSEPH H. ANTHONY 
Associate Professor of Accounting 
Michigan State University 


SHARON H. GARRISON, The Financial Impact of Corporate Events on Corporate 
Stakeholders (Westport, CT: Quorum Books, 1990, pp. ix, 182, $39.95). 


Events studies lend themselves very nicely to inductive research framed in Sir Karl Popper’s 
epistemology. An established theory, in this case the Efficient Markets theory, is subjected to 
repeated tests that might refute it. Explanations of confirmed refutations should enhance our 
understanding of the theory and perhaps even improve it. A deeper understanding of the Efficient
Markets theory would be quite useful for all of the various parties interested in the financial
affairs of corporations.

Professor Sharon H. Garrison’s work falls somewhere in the middle of that whole process. 
She has reviewed a large number of studies, including a few of her own, in which the stock 
market did not seem to anticipate various events. A review of this type can be an important
contribution to our literature because inductive research usually consists of a multitude of small
studies scattered across many topics and published in many different outlets. In the absence of 
survey studies, many corporate stakeholders may be unaware that certain types of events might 
temporarily disrupt market efficiency. 
Garrison’s book actually consists of two parts, the first consisting of a general review of
investment analysis and the second her survey of events studies. Roughly 70 percent of the book
is devoted to the investment analysis review, which is surprising given the avowed objective of
her work.

In the first part of the book, Garrison provides some examples of “events,” reviews basic 
stock valuation models, describes various sources of financial data, and presents an extended 
arithmetic example of events studies methodology. Her basic hypothesis is that only changes in 
anticipated earnings can change the fundamental values of stocks. Changes in risk class seem to 
play no role in her model; indeed, CAPM risk variables are shunted off into an appendix. The 
only accounting materials in this part of the book are an appendix on financial ratio analysis 
and another on funds flow analysis. Unfortunately, the funds flow materials cover the working 
capital approach rather than the cash flow statements now being reported. The data source 
descriptions, while comprehensive, suffer from a curious omission of computerized data banks 
such as COMPUSTAT or CRSP. Overall, these materials read well, but their general clarity is 
obscured by a plethora of appendixes, a few of which are just reprints of journal
articles. 

In the second part of the book, Garrison reviews the findings of over 60 events studies and 
includes two reprinted studies in appendixes. These studies range over a variety of topics,
including: proxy fights; dividends and repurchases; executive deaths; mergers and divestitures;
strikes; ratings changes; debt swaps; capital expenditures; product recalls; recommendations
and opinions; index and exchange listings; and investigations and crimes. While all of those
events could certainly affect earnings in some general fashion, none of them would seem to
affect accounting earnings directly. Only one study, involving qualified audit opinions, dealt 
with accounting as such. Aside from a few anomalous results in proxy fights, executive deaths, 
and investigative journalistic events, the overall results of these studies were plausible: “good” 
events were followed by positive abnormal returns and “bad” events were followed by negative 
returns. Also, the way in which these events were managed, such as product recalls, could 
affect the return outcomes as well. Consequently, anyone interested in the financial affairs of
corporations should keep an eye on the underlying probabilities of such events and monitor 
their ensuing development once they occur. 

Overall, Garrison’s book makes its greatest contribution in the events studies review. The 
various bibliographies at the end of each chapter include many studies that might escape the 
attention of researchers consulting only “first-line” journals. For that reason, I believe that it 
would be a useful addition to business school libraries. However, I cannot envisage its specific 
use in accounting courses because it does not cover the impact of accounting events on security 
returns. 

JAMES O. HORRIGAN 
Professor of Accounting and Finance 
University of New Hampshire 


DONALD L. MADDEN and JAMES R. HOLMES, Management Accountants: Respond- 
ing to Change—An Exploratory Study (Montvale, NJ: National Association of Ac- 
countants, 1990, pp. xvii, 63, $13.95, paper). 


This study, sponsored by the Research Committee of the NAA, tells of the changes taking 
place in the field of management accounting. Based on in-depth interviews of executives in a 
number of firms encompassing a wide variety of industries, it is interesting and comprehensive, 
yet concise. These features allow the reader to receive a wealth of information on an important 
and timely topic without an inordinate expenditure of time. 

Chapter 1 provides an overview of the study. It states as the study’s objective, “to discover 
and explain what accountants are doing to satisfy management’s changing decision needs” 
(p. 1). The authors indicate that this objective is based on the premise that in order to compete 
effectively, U.S. companies have instituted major innovations in the practice of management 
accounting. The authors investigated three separate areas of activity: (1) positioning management
accounting resources; (2) the development of new performance measurements and specialized
computer applications; and (3) priorities in the training and development of management
accounting personnel. The design of the study included a literature review, field interviews, and 
telephone interviews. Each of the remaining three chapters focuses on the study’s results as 
related to each of the three activities mentioned above. 

Chapter 2 relates to the positioning of management accounting resources, with positioning 
defined as “the placement of resources at locations where decisions are executed” (p. 13). There 
is a focus on the changing responsibilities of management accountants. In Chapter 3, the au- 
thors indicate how firms are responding to changing data needs, with emphasis on improved 
measurements for performance evaluation and concern about the effectiveness of software re- 
sources, especially in the areas of internal control and documentation. Chapter 4 focuses on 
what firms are doing to provide professional development (including ethical issues) for manage- 
ment accounting personnel in this changing environment. 

The results of the study are written in a descriptive format. That is, the authors have sum- 
marized the results of their extensive interviews. In addition, included throughout the book 
are ten scenarios. These are specific situations from individual firms that relate to management 
accounting issues. Each scenario describes the setting, the need, and the actions of a specific
management accounting problem. The scenarios are interesting and help the reader to under- 
stand specific problems and how firms solved them. 

The results described in Chapters 2, 3, and 4 were obtained from field interviews and tele- 
phone interviews, which are the heart of the study. They make the book unique regarding the 
topic of the new role for the management accountant. The field and telephone interviews con- 
sisted of a number of open-end questions (that are included as Appendix B and Appendix D, 
respectively) to top-level executives. The number and diversity of the respondents is impressive.
The field interviewees consisted of 24 executives, mostly controllers at the division and corpo- 
rate level, from 15 different unidentified companies that included manufacturing, jewelry retail- 
ing, coal and energy, banking, electrical power, restaurants, and conglomerates. The list of 
telephone respondents is equally impressive. This part of the study consisted of 20-minute inter- 
views with executives, again mostly controllers, from 24 companies that included 22 different 
types of firms.

The authors have made a significant contribution to the literature of management account- 
ing with this study. Anyone with an interest in the changes that are taking place in management 
accounting should read this book. This study could also serve as an excellent supplement in 
advanced undergraduate or graduate managerial courses for accounting majors. It would pro- 
vide students with a background on what to expect as they enter the profession of management 
accounting. 

PIERRE L. TITARD 
Professor of Accounting 
The University of Alabama in Huntsville 


PAUL J. MIRANTI, JR., Accountancy Comes of Age: The Development of an American 
Profession, 1886-1940 (Chapel Hill, NC: The University of North Carolina Press, 
1990, pp. xi, 275, $29.95). 


Given that few books in the field of accounting are published by distinguished university 
presses, it becomes obvious that this volume is noteworthy if for no other reason than the fact 
that it fell into a niche normally reserved for the most significant scholarly research. 

The book addresses the history of the public accounting profession in the United States be- 
tween 1886 and 1940. Although the primary emphasis is on various organizations of public 
accountants, there is also a comparison to the activities of professional organizations repre- 
senting other disciplines. The first chapter, in particular, relates the history of the accounting 
profession to such professions as law, medicine, and engineering. All of these professions faced 
the dual problems of gaining both internal cohesiveness and external acceptance. Later chapters 
go into great detail about the development of the public accounting profession. The book would 
be of great interest to the accountancy trivia buff. For instance, how many are aware that
Charles Waldo Haskins was a nephew of Ralph Waldo Emerson and could trace his ancestry
back to the Mayflower?

One point that should be mentioned is that the research is largely based on secondary 
sources, particularly the earlier research of James Don Edwards, John Carey, Norman E. Web- 
ster, Gary Previts, and Barbara Merino. Although the view that primary documents should be 
the major resource for accounting historians usually takes precedence, there is nothing wrong 
with rereading existing material and addressing different questions to it. This is what Miranti 
has done. In effect, this work readdresses the work of approximately 800 prior authors. Miranti 
interweaves all of this material well and brings to the forefront materials that are not widely 
known. He also weaves in conclusions from such nonaccounting sources as John Higham’s 
Strangers in the Land, L. Frank Baum’s Wizard of Oz, and the programs of the Sons of the Amer- 
ican Revolution. The emphasis was made that polarization in the profession during the early 
period derived not only from contrasting views about technical accounting matters, but also 
from the differing values and outlooks about the major events affecting American society. 

The publication of the bibliography is a valuable contribution in itself. The bibliography, in 
very small print, occupies 34 pages and includes about 800 entries. Over 80 dissertations are 
cited—most dealing with some aspect of accounting history during the period covered by the 
study. Although the bibliography is quite lengthy, there are only a few references cited from 
after 1980. Even though later references would represent secondary sources, it is obvious that 
this work was little updated from the author’s doctoral dissertation written at Johns Hopkins 
University. Research published during the past decade has offered new insight into some of the 
events discussed by Miranti. Such criticism does not detract from the overall tone of the study. 

In summary, this book represents a most comprehensive overview and explanation of the 
development of the public accounting profession in the United States. It definitely should be 
read by every doctoral student in accounting, and perhaps every master’s student as well. With 
a background of the material included in this volume, students would better appreciate the trials 
faced by the early leaders of the accounting profession. 

DALE L. FLESHER 
Professor of Accountancy 
University of Mississippi 
SYNOPSIS AND INTRODUCTION: Errors arise in measuring changes in
prices of assets due to imperfection and incompleteness of asset markets.
Furthermore, the rates of price change and the magnitudes of errors of
measurement vary and are often correlated across assets. Suppose we
characterize an economy by means and variances of price changes for
individual goods and of measurement errors in these changes, as well as by
the degree of diversification in the asset portfolios held by individual firms.
In such an economy, the linear valuation rule that yields the most efficient
estimate of change in the economic value of these asset portfolios is the
one that minimizes the mean squared error (MSE). This paper presents a
linear aggregation model of valuation to help understand how the minimum
MSE valuation rule is affected by various parameters that characterize the
economy, and the circumstances under which the historical-cost valuation rule
yields a (statistically) more precise estimate of the unobserved economic
value of firms’ assets than the current valuation rule. The analytical findings
of the paper are consistent with the reluctance of accountants to depart
from historical cost in spite of the existence of low inflation, and in spite of
scholarly critiques of this valuation rule by Chambers (1966), Edwards and
Bell (1961), Sterling (1970), and others. They are also consistent with the
use of specific price indexes by most firms to prepare SFAS 33 disclosures.
Several testable implications of the results are provided.
A direct comparison of the characteristics of valuation rules is
complicated by the heterogeneity of the decision contexts in which
accounting numbers are used. We use the mean squared error (MSE)
between the principal value and its various estimators to rank the latter.
Using this criterion, previous simpler models that ignore the presence of
measurement errors in price changes have shown that the use of
increasingly detailed price indexes yields more precise valuation; current
valuation is the most precise valuation rule because it uses the most
detailed set of indexes (Sunder 1978). We show that this basic result does
not hold when the measurement of price changes is subject to errors. As
the magnitude of these measurement errors increases relative to the
magnitude of price changes, the most accurate valuation rule requires a
less detailed set of price indexes. A key implication of this result is that the
existence of inflation or deflation is not sufficient for general-price-level
valuation, specific-price-index valuation, or current valuation to dominate
historical-cost valuation as an estimator of the economic value of firms’
assets. Historical-cost valuation is dominated by others only when the
magnitude of price changes is large relative to the errors of measurement
in price changes.


Key Words: Valuation rules, Valuation errors, Price indexes, Current
valuation, Inflation accounting.




THE remainder of this article is organized in four sections. Section I of the paper
summarizes a model of valuation rules as linear aggregations and some key
results from prior work that assumes the absence of measurement errors. In section
II, efficient sets of estimators in the presence of measurement errors are analyzed
and it is shown that, under specific assumptions about parameters, the marginal
reduction in MSE from increasingly specific price indexes keeps decreasing. In section
III we derive conditions under which historical-cost valuation and some specific
price-index valuation rule are globally minimum MSE estimators. Section IV presents
empirically testable implications and suggests directions for further research.


I. Valuation Rules as Linear Aggregations


By analyzing environments where price changes are measured without error,
Sunder (1978) shows that the bias and MSE¹ associated with a valuation rule $R_{k,\lambda}$ are
given by:

$$\mathrm{Bias}(R_{k,\lambda}) = E_w E_r (R_{k,\lambda} - R_{n,1}) = 0, \qquad k \ge 1 \tag{1}$$

$$\mathrm{MSE}(R_{k,\lambda}) = E_w E_r (R_{k,\lambda} - R_{n,1})^2 = \frac{1}{p}\left[\omega'(\sigma + \tilde{\mu}) - \sum_{u=1}^{k} \frac{\omega_u'(\Sigma_u + \mu_u \mu_u')\,\omega_u}{\omega_u' e}\right], \qquad k \ge 1 \tag{2}$$

To explain the notation briefly (see Sunder 1978 for details), we observe that the
model assumes n distinct goods in the economy. The relative abundance of these goods
is specified by an unchanging vector $\omega$ of relative weights.² Because the elements of $\omega$
sum to 1, $\omega_i = 0.1$ implies that the value of good i in the economy amounts to 10 percent
of the total value of all goods in the economy. These weights are used to construct price
indexes from prices of individual goods or assets. The vector of relative weights for the
asset portfolio of individual firms, $w$, has a multinomial distribution with parameter $\omega$.³
Vector $r$ denotes the relative price changes (inflation or deflation) for n goods during an
arbitrary time interval 0 to 1; $\mu$ and $\Sigma$ are the expectation and variance, respectively, of $r$.
Let S(n,k) be the number of distinct ways in which a set of n goods can be partitioned
into k subsets.⁴ Taking the weighted average (using weights $\omega$) of price changes for goods
included in each subset produces k price indexes. When a good is included in the asset
portfolio of a firm as well as in a given index u, the firm adjusts the historical cost of the
good on its books by this price index to arrive at its estimated current value. The percentage
change in the sum of the estimated current value of all assets in the firm’s portfolio,
relative to its historical cost, is denoted by $R_{k,\lambda}$ if the $\lambda$th of the S(n,k) possible k-index
valuation rules (ordered in some arbitrary manner) is used to estimate the current value.
Note that when k=n or k=1, S(n,k)=1, which means that there is only one possible
way each of partitioning n goods into n and 1 partitions, respectively; use of the former
yields the current valuation rule denoted by $R_{n,1}$, while the use of the latter yields general
price-level valuation (GPL) denoted by $R_{1,1}$. In addition, we could denote the historical-cost
valuation rule by $R_{0,1}$, because there is only one possible way of using no price indexes.
For all other values of k, there are S(n,k)>1 ways of constructing k price indexes. $\sigma$ is
the vector of diagonal elements of $\Sigma$, and $\tilde{\mu}$ is the vector of squared elements of $\mu$. $\omega_u$ and
$\mu_u$ are vectors containing those elements of $\omega$ and $\mu$, respectively, that are included in
the uth price index, and $\Sigma_u$ is the submatrix of $\Sigma$ consisting of all those rows and
columns that correspond to the goods contained in the uth index. Finally, $e$ is a vector
of unit elements with appropriate length. A numerical example of valuation using rules
of this class is given in Lim and Sunder (1990, 171-2).

¹ For statistical analysis of valuation rules as linear aggregates, see Ijiri (1967, 1968) and Lev (1969). For
ranking valuation rules by the mean squared error criterion, see Hall (1982); Hall and Shriver (1990); Ijiri and Noel
(1984); Shriver (1986, 1987); Sunder (1976, 1978); Sunder and Waymire (1983, 1984); Tippett (1987); and
Tritschler (1989). When markets are incomplete, it is not usually possible to rank unambiguously the values of
various asset portfolios from the viewpoint of all agents (Beaver and Demski 1979). However, if the
preferences of these agents are defined over means and variances of returns from these baskets, different
valuation rules can be ranked unanimously in spite of incomplete markets (Eckern and Wilson 1974; Radner
1974). Also, in an economy-wide sense, neither overvaluation nor undervaluation is desirable because both
lead to a suboptimal resource allocation. Since the mean squared error metric deals with errors in both directions, it
is not an unreasonable choice for a loss function.

² Unchanging $\omega$ (and $w$, to be defined below) imply that this valuation model takes the asset composition of
firms and the economy as given; it does not try to explain the dynamics of change in asset composition and
prices.

³ In effect, it is assumed that the asset portfolio of each firm is constructed by p independent draws
with replacement from an urn in which each asset is represented by a different colored ball and the relative
proportion of each color is given by $\omega$. Thus, each firm is statistically identical in the sense that its asset basket is
drawn from the same distribution. This structure ignores the dependencies that may exist across firms (e.g.,
among firms within an industry). For analysis of valuation rules in industry-segmented economies, see Lim
and Sunder (1990). From properties of the multinomial distribution, $E(w) = \omega$ and $\mathrm{Cov}(w_i, w_j) = \omega_i(1-\omega_i)/p$ if
i=j, and $-\omega_i\omega_j/p$ otherwise. Parameter p can be interpreted as a diversification parameter; as p increases, the
vector of asset proportions for individual firms statistically converges to the economy-wide vector of asset
proportions $\omega$ (see Feller 1968 and Sunder 1978).

⁴ This is the Stirling Number of the Second Kind, given by:

$$S(n,k) = \sum_{j=0}^{k} \frac{(-1)^j (k-j)^n}{j!\,(k-j)!}$$

See Apostol (1967, 584).

Sunder (1978) shows that bias (1) for all estimators of this class⁵ is zero. Mean
squared error (2) monotonically decreases in fineness⁶ of the partition used in constructing
$R_{k,\lambda}$. It immediately follows that the MSE is largest for the coarsest estimator
(GPL valuation, $R_{1,1}$) and is smallest for the finest estimator (current valuation, $R_{n,1}$).
The intuition behind this result is that the use of coarser price indexes in valuation
causes a greater mismatch between the relative weights assigned to changes in price of
individual goods, and the relative proportions in which each good may be held in the
actual asset portfolios of individual firms. The MSEs of other linear valuation rules lie
between these two extremes.

We can add historical-cost valuation to this family by defining it as the valuation
$R_{0,1} \equiv 0$, because historical valuation of an unchanging basket of assets does
not change over time. The historical-cost valuation is the only member of the family
with a nonzero bias. In the absence of measurement errors, its MSE in equation (4) is
greater than the MSE of the other valuation rules (see Sunder 1978).

$$\mathrm{Bias}(R_{0,1}) = E_w E_r (R_{0,1} - R_{n,1}) = -\omega'\mu \tag{3}$$

and

$$\mathrm{MSE}(R_{0,1}) = E_w E_r (R_{0,1} - R_{n,1})^2 = \frac{1}{p}\,\omega'(\sigma + \tilde{\mu}) + \left(1 - \frac{1}{p}\right)\omega'(\Sigma + \mu\mu')\,\omega. \tag{4}$$

The latter property of the valuation rules can readily be seen under the label MV in the
sixth column of table 1, which lists the MSE values associated with all 16 possible valuation
rules in a four-good economy for a numerical example with the following parameters:⁷


• relative weight of each good in the economy, $\omega'$ = (0.2, 0.3, 0.4, 0.1);
• expected price change for each good in the economy, $\mu'$ = (0.1, 0.2, 0.6, 0.4); and
• covariance matrix of relative price changes for goods in the economy,


⁵ The total number of distinct estimators in this class for an n-good economy is given by:

$$\sum_{k=1}^{n} S(n,k)$$

where S(n,k) is as given in footnote 4.

⁶ Price index set x is finer than price index set y if and only if all the goods contained in each one of the price
indexes of set x are also included in some price index of set y. For a five-good economy, for example, the index
set {(1,2), (3), (4), (5)} is finer than {(1,2), (3,4), (5)} but is not comparable in fineness to {(1), (2,3), (4,5)}. Note that
increasing k does not necessarily mean a finer valuation rule or a smaller mean squared error.

⁷ Although this numerical example uses zero covariances, the model does not require them to be zero.
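
The counts described in footnotes 4 and 5 can be checked numerically. The following is a minimal Python sketch, assuming the standard alternating-sum formula for Stirling numbers of the second kind; the function names `stirling2` and `count_index_rules` are illustrative and do not come from the paper.

```python
from math import comb, factorial


def stirling2(n: int, k: int) -> int:
    """Number of ways to partition n distinct goods into k non-empty subsets."""
    # Alternating-sum formula: (1/k!) * sum_j (-1)^j * C(k, j) * (k - j)^n
    total = sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))
    return total // factorial(k)  # the sum is always divisible by k!


def count_index_rules(n: int) -> int:
    """Total number of index-based valuation rules (k = 1, ..., n)."""
    return sum(stirling2(n, k) for k in range(1, n + 1))


if __name__ == "__main__":
    n = 4
    for k in range(1, n + 1):
        print(f"S({n},{k}) = {stirling2(n, k)}")
    print("index-based rules:", count_index_rules(n))
```

For a four-good economy this gives S(4,k) = 1, 7, 6, 1 for k = 1, ..., 4, that is, 15 index-based rules; adding historical cost yields the 16 valuation rules listed in table 1.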
Table 1
Properties of Estimators of Current Value in a Four-Good Economy: Example 1

                                          Economy-Wide Average Errors
 (1)   (2)     (3)      (4)           (5)                   (6)                  (7)
               Serial
  k   S(n,k)   Number   Partition     MSE(R_kλ)         =   MV                +  MR
  0     1       -       Historical    1.260 + 2.214/p*      1.260 + 2.214/p*     0*
  1     1       1       GPL (abcd)    0.66 + 2.214/p*       2.214/p*             0.66*
  2     7       1       (a,bcd)       0.66 + 2.380/p        1.765/p              0.66 + 0.615/p
                2       (b,acd)       0.66 + 2.174/p        1.205/p              0.66 + 0.969/p
                3       (c,abd)       0.66 + 1.329/p*       0.756/p*             0.66 + 0.573/p
                4       (d,abc)       0.66 + 2.162/p        1.911/p              0.66 + 0.251/p*
                5       (ab,cd)       0.66 + 1.704/p        1.044/p              0.66 + 0.660/p
                6       (ac,bd)       0.66 + 1.968/p        1.211/p              0.66 + 0.757/p
                7       (ad,bc)       0.66 + 2.155/p        1.605/p              0.66 + 0.550/p
  3     6       1       (ab,c,d)      0.66 + 1.381/p*       0.481/p              0.66 + 0.900/p
                2       (ac,b,d)      0.66 + 2.040/p        0.833/p              0.66 + 1.207/p
                3       (ad,b,c)      0.66 + 1.704/p        0.297/p*             0.66 + 1.407/p
                4       (a,bc,d)      0.66 + 2.282/p        1.399/p              0.66 + 0.883/p*
                5       (a,bd,c)      0.66 + 1.668/p        0.378/p              0.66 + 1.290/p
                6       (a,b,cd)      0.66 + 2.063/p        0.563/p              0.66 + 1.500/p
  4     1       1       Current       0.66 + 1.740/p*       0*                   0.66 + 1.740/p*
                        Valuation
                        (a,b,c,d)

* Denotes members of the efficient set and efficient frontier.
k = number of price indexes in the partition (valuation rule),
S(n,k) = number of ways of partitioning a set of n distinct elements (four in our example) into k subsets
(price indexes),
λ = partition identifier (serial number),
p = diversification parameter (see footnote 3),
MSE(R_kλ) = total mean squared error of valuation rule R_kλ,
MV = movement error of valuation rule R_kλ, and
MR = measurement error of valuation rule R_kλ.

$$\Sigma = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix} \tag{5}$$


These definitions of the family of valuation rules and a summary of their known
econometric properties set the stage for an analysis of their properties in the presence
of measurement errors.
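
The movement-error column of table 1 can be recomputed from the example parameters. The sketch below is illustrative only and rests on two assumptions that are not stated verbatim in the text: that the movement error of an index-based rule equals c/p with c = ω'(σ + μ̃) minus the sum over indexes u of ω_u'(Σ_u + μ_u μ_u')ω_u / (ω_u'e), and that the covariance matrix is diagonal with variances (1, 3, 5, 2). These choices were made because they reproduce most MV entries of table 1 to three decimals. All function names are hypothetical.

```python
# Parameters of the four-good example (goods a, b, c, d).
OMEGA = {"a": 0.2, "b": 0.3, "c": 0.4, "d": 0.1}  # economy-wide weights
MU = {"a": 0.1, "b": 0.2, "c": 0.6, "d": 0.4}     # expected price changes
# Assumption: zero covariances and variances (1, 3, 5, 2), inferred
# because they reproduce the MV column of table 1.
VAR = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 2.0}


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Place `first` into each existing block in turn ...
        for i, block in enumerate(partial):
            yield partial[:i] + [[first] + block] + partial[i + 1:]
        # ... or give it a block (price index) of its own.
        yield [[first]] + partial


def movement_coefficient(partition):
    """Return c such that the movement error of the rule is c / p."""
    total = sum(OMEGA[g] * (VAR[g] + MU[g] ** 2) for g in OMEGA)
    for block in partition:
        weight = sum(OMEGA[g] for g in block)
        index_second_moment = (
            sum(OMEGA[g] ** 2 * VAR[g] for g in block)
            + sum(OMEGA[g] * MU[g] for g in block) ** 2
        )
        total -= index_second_moment / weight
    return total


def historical_cost_mse(p):
    """MSE of historical cost without measurement error (assumed form)."""
    second_moment = (
        sum(OMEGA[g] ** 2 * VAR[g] for g in OMEGA)
        + sum(OMEGA[g] * MU[g] for g in OMEGA) ** 2
    )
    spread = sum(OMEGA[g] * (VAR[g] + MU[g] ** 2) for g in OMEGA)
    return spread / p + (1 - 1 / p) * second_moment


if __name__ == "__main__":
    for partition in sorted(set_partitions(list("abcd")), key=len):
        label = ",".join("".join(sorted(block)) for block in partition)
        print(f"({label}): MV = {movement_coefficient(partition):.3f}/p")
```

Under these assumptions the coefficient falls from about 2.214 for GPL to zero for current valuation, while historical cost carries the additional constant of about 1.260 that does not vanish as p grows.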


II. Efficiency of Valuation Rules Under Price Movement and Measurement Errors 


Two factors give rise to the error of valuation rules: price changes and errors in
measurement of price changes. We shall label the errors arising from these factors
movement (MV) and measurement (MR) errors, respectively. The magnitude of both
types of valuation errors is influenced by the divergence between the relative weights of
various assets in the portfolios of individual firms on one hand, and the effective
relative weights assigned to their respective price changes by various valuation rules on
the other. Greater divergence increases the movement error but decreases the
measurement error. Under current valuation, each good is in a price index by itself, and the
two sets of weights are identical. This identity of actual and effective weights reduces
movement error to its minimum but takes the measurement error to its maximum. As fewer,
more aggregated, price indexes are used in valuation, movement error increases; the
measurement error, on the other hand, decreases as errors in measurement of individual
price changes included within the same index tend to cancel each other. Econometrically
efficient valuation rules are found by examining the effect of aggregation on the
sum of these two types of errors.
The error of measurement in the price relatives of individual goods arises from several
sources. Sampling error, substitution bias, and quality error cause a theoretical
price relative to differ from its empirical measures.
Sampling is an important source of error in price indexes since they are estimated
from samples, not from entire populations. The data for construction of price indexes
are gathered almost entirely from a network of samples: samples of products, of
localities, of reporters, and of points in time. Therefore, the value of a price index
depends on the particular samples. For example, the Producer Price Index (PPI) is
constructed by collecting approximately 10,000 quotations for 2,800 commodities in
primary markets of the United States every month (Bureau of Labor Statistics 1976). Even
though the probability-proportional-to-size sampling technique has reduced the sampling
error in the PPI since August 1978 (Early 1978, 1979), such error cannot be avoided entirely.
Since both the Consumer Price Index (CPI) and the PPI are based on a fixed-weight
formula (Laspeyres Index), substitution is not considered, and there exists an
inherent upward bias due to substitution: a price index may remain the same while real
prices fall because of the arrival of cheaper substitutes of similar quality. Empirical
work shows the substitution bias to be relatively small (Triplett 1975).
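The direction of the substitution bias can be illustrated with a minimal sketch. The two-good prices and quantities below are hypothetical, the function names are ours, and the current-weight (Paasche) comparison is a standard textbook device rather than anything used in this paper.

```python
def laspeyres(p0, p1, q0):
    """Fixed-weight index: prices move, base-period quantities are held constant."""
    return sum(a * q for a, q in zip(p1, q0)) / sum(b * q for b, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Current-weight index: same prices, valued at current-period quantities."""
    return sum(a * q for a, q in zip(p1, q1)) / sum(b * q for b, q in zip(p0, q1))


# Hypothetical data: good 1 doubles in price and buyers shift toward good 2.
p0, p1 = [1.0, 1.0], [2.0, 1.0]
q0, q1 = [10.0, 10.0], [5.0, 15.0]

fixed_weight = laspeyres(p0, p1, q0)    # ignores the shift in quantities
current_weight = paasche(p0, p1, q1)    # reflects the shift in quantities
print(fixed_weight, current_weight)
```

With these made-up numbers the fixed-weight index reports a larger price increase than the current-weight index, which is the sense in which ignoring substitution biases a Laspeyres index upward.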
Another source of error in price indexes is the change in quality of goods. If quality
improves over time, the index may have an upward bias. But Triplett (1975) showed
that quality error is not necessarily upward and that its sign is indeterminate,
because price indexes are explicitly designed to attempt to adjust for
quality changes. Since the optimum quality adjustment procedure is unknown, it is
difficult to measure the quality error. Furthermore, many price quotations used to
construct indexes are not subject to bargaining and thus are not always equal to actual
transaction prices (Price Statistics Review Committee of the National Bureau of
Economic Research 1961, 70).

We analyze the efficiency of valuation rules when errors caused by both price
movement and price measurement are present. Let $\hat{r}$, the observed price relative vector,
be the sum of the unobserved true price relative $r$ and measurement error $e$: $\hat{r} = r + e$, where
$E(r)=\mu$, $Var(r)=E(r-\mu)(r-\mu)'=\Sigma$, $E(e)=0$, $Var(e)=E(ee')=\Lambda$, $Cov(e,r)=0$,
and a hat (^) denotes variables measured with error.⁸


⁸ We assume that the measurement errors are unbiased; $E(e)=0$. None of the qualitative results in
the article change when these errors have nonzero expectation of $\gamma$. The quantitative effect can be obtained by
the following substitutions in the text: $\Lambda$ by $(\Lambda+\gamma\gamma')$, $\Lambda_{uu}$ by $(\Lambda_{uu}+\gamma_u\gamma_u')$, and $\delta$ by $(\delta+\ddot{\gamma})$, where $\ddot{\gamma}$ is the vector of
squared elements of $\gamma$. Hall and Shriver (1990) document empirical evidence suggesting that prices in the PPI
database may have an upward bias relative to a privately gathered database.
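Following the notation used above, the mean squared difference between the
estimator $\hat{R}_{k\lambda}$ and the measurement-error-free current valuation, $R_{n1}$, is:⁹

$MSE(\hat{R}_{k\lambda}) = E(\hat{R}_{k\lambda} - R_{n1})^2$   (6)

$= \left(1-\frac{1}{p}\right)\omega'\Lambda\omega + \frac{1}{p}\sum_{u=1}^{k}\frac{\omega_u'\Lambda_{uu}\omega_u}{\omega_u^*} + \frac{1}{p}\sum_{j=1}^{n}\omega_j(\sigma_{jj}+\mu_j^2) - \frac{1}{p}\sum_{u=1}^{k}\frac{\omega_u'(\Sigma_{uu}+\mu_u\mu_u')\omega_u}{\omega_u^*},$

where $\Lambda_{uu}$ is a submatrix of $\Lambda$ consisting of those rows and columns of $\Lambda$ that
correspond to elements included in the uth price index, $\omega_u$ and $\mu_u$ are the corresponding
subvectors of $\omega$ and $\mu$, and $\omega_u^*$ is the total weight in $\omega$ of the goods included in
the uth price index (see the appendix for derivation).

In the special case when the measurement error is zero (i.e., $\Lambda=0$), expression (6) is
reduced to expression (2) because the first two terms of expression (6) drop out.
Therefore, expression (2) can be seen as the movement error (MV) component of the error of
the valuation rule $\hat{R}_{k\lambda}$. Similarly, when there is no price movement error (i.e., $\mu=0$ and
$\Sigma=0$), the last two terms of expression (6) drop out, leaving the measurement error
(MR) component given by:

$MR(\hat{R}_{k\lambda}) = \left(1-\frac{1}{p}\right)\omega'\Lambda\omega + \frac{1}{p}\sum_{u=1}^{k}\frac{\omega_u'\Lambda_{uu}\omega_u}{\omega_u^*}.$   (7)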
Proposition 1 states the result that the total MSE of valuation rules can be decomposed
into two additive components:

Proposition 1: The total mean squared error of a valuation rule is decomposable
into the sum of its movement error (MV) and its measurement error (MR):

$MSE(\hat{R}_{k\lambda}) = MR(\hat{R}_{k\lambda}) + MV(\hat{R}_{k\lambda}).$   (8)
An important property of measurement error is given by the following theorem.

Theorem 1: Measurement error (MR) of valuation rules increases monotonically
with the fineness of the index system used in valuation.

Proof: See the appendix.


Since measurement error increases with the fineness of a valuation rule, it follows
immediately that this error is highest for the n-index (i.e., current) valuation and lowest
for the one-index (i.e., GPL) valuation, given by expressions (9) and (10), respectively:

$MR(\hat{R}_{n1}) = \left(1-\frac{1}{p}\right)\omega'\Lambda\omega + \frac{1}{p}\omega'\delta$   (9)

and

$MR(\hat{R}_{11}) = \omega'\Lambda\omega,$   (10)

where $\delta$ is the vector of diagonal elements of $\Lambda$. In addition, the historical cost
valuation, being independent of $\hat{r}$, is entirely free of valuation errors caused by measurement
errors in $\hat{r}$.

⁹ Note that $R_{n1}$ is the measurement-error-free current valuation or principal aggregate, which is
unobservable. In contrast, $\hat{R}_{n1}$ is the current valuation estimate of $R_{n1}$ computed from observed price data.
Note that the effect of increasing the fineness of valuation rules on their measurement
errors is opposite to its effect on their movement errors. Use of coarser price indexes
reduces errors of measurement through diversification. However, coarse price
indexes also make it more difficult to track the movement of prices. This is a key property
of valuation rules and it plays a critical role in identifying the minimum MSE or
efficient valuation rules.
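The opposing effects of fineness can be traced numerically along a chain of successively finer partitions. The sketch below is illustrative only: the parameter values (weights, means, and diagonal variances) are assumptions back-solved from the entries of table 1, and the two slope formulas are the diagonal-covariance forms of the movement and measurement components as we read them from the definitions above, not quoted expressions. Function names are hypothetical.

```python
# Assumed parameters (back-solved from table 1; zero covariances).
OMEGA = {"a": 0.2, "b": 0.3, "c": 0.4, "d": 0.1}  # economy-wide weights
MU = {"a": 0.1, "b": 0.2, "c": 0.6, "d": 0.4}     # mean price relatives
SIGMA = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 2.0}  # variances of price relatives
DELTA = {"a": 3.0, "b": 4.0, "c": 1.0, "d": 2.0}  # variances of measurement errors


def mv_slope(partition):
    """Coefficient of 1/p in the movement error of a partition."""
    total = sum(OMEGA[j] * (SIGMA[j] + MU[j] ** 2) for j in OMEGA)
    tracked = 0.0
    for index in partition:
        weight = sum(OMEGA[j] for j in index)
        second_moment = sum(OMEGA[j] ** 2 * SIGMA[j] for j in index)
        second_moment += sum(OMEGA[j] * MU[j] for j in index) ** 2
        tracked += second_moment / weight
    return total - tracked


def mr_slope(partition):
    """Coefficient of 1/p in the measurement error, beyond the constant term."""
    constant = sum(OMEGA[j] ** 2 * DELTA[j] for j in OMEGA)
    pooled = sum(
        sum(OMEGA[j] ** 2 * DELTA[j] for j in index) / sum(OMEGA[j] for j in index)
        for index in partition
    )
    return pooled - constant


# Each partition is strictly finer than the one before it.
chain = [["abcd"], ["c", "abd"], ["ab", "c", "d"], ["a", "b", "c", "d"]]
mv_path = [round(mv_slope(part), 3) for part in chain]
mr_path = [round(mr_slope(part), 3) for part in chain]
print(mv_path)
print(mr_path)
```

Under these assumed values the movement slope falls along the chain while the measurement slope rises, and the individual values agree with columns 6 and 7 of table 1 for the same partitions.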


Efficient Estimators 


For any given value of k, there are a total of S(n,k) distinct estimators or valuation
rules. Let $\hat{R}_{k*}$ denote the efficient k-index estimator, such that it has the least mean
squared error of all k-index valuation rules:

$MSE(\hat{R}_{k*}) \le MSE(\hat{R}_{k\lambda})$, for all $\lambda = 1, 2, \ldots, S(n,k)$.   (11)


Definition 1: The efficient set of estimators consists of those estimators whose
mean squared error is not greater than the mean squared error of any other
estimator with the same number of price indexes. The efficient frontier, H(k),
k=1,...,n, is the set of mean squared errors associated with the efficient set of
estimators:

$H(k) = MSE(\hat{R}_{k*})$, k=1,...,n.   (12)


For each k, the efficient set has one or more valuation rules because several k-index 
valuation rules may attain the minimum MSE. However, only one point corresponding 
to this MSE lies on the efficient frontier. In the illustrative numerical example, we use 
the following parameter values: 


RS =(0.1, 0.2, 0.6, 0.4), P= 


Then the complete set of estimators for the four-good economy consists of 16 elements,
as shown in table 1. From these numbers, it is easy to verify Proposition 1
(decomposability of total error into MV and MR), Theorem 1 (MR's monotone increase in the
fineness of the estimator), and that the GPL valuation has the lowest measurement error of
all estimators other than the historical-cost valuation, while the current valuation has
the highest measurement error. Partitions that constitute the efficient set and their
corresponding mean squared errors on the efficient frontier are marked by asterisks in the
fifth column of table 1.
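The efficient set reported in table 1 can be reproduced by brute-force enumeration of all partitions of the four goods. This is a sketch, not the authors' procedure: the parameter values are assumptions back-solved from table 1 (zero covariances), the slope formula is the diagonal-covariance form of MV + MR as we read it, and all function names are hypothetical. Because every rule with k >= 1 shares the same constant term, ranking by the coefficient of 1/p is enough.

```python
# Assumed parameters (back-solved from table 1; zero covariances).
OMEGA = {"a": 0.2, "b": 0.3, "c": 0.4, "d": 0.1}  # economy-wide weights
MU = {"a": 0.1, "b": 0.2, "c": 0.6, "d": 0.4}     # mean price relatives
SIGMA = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 2.0}  # variances of price relatives
DELTA = {"a": 3.0, "b": 4.0, "c": 1.0, "d": 2.0}  # variances of measurement errors


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for blocks in set_partitions(rest):
        for i in range(len(blocks)):            # add `first` to an existing block
            yield blocks[:i] + [[first] + blocks[i]] + blocks[i + 1:]
        yield [[first]] + blocks                # or give `first` its own block


def mse_slope(blocks):
    """Coefficient of 1/p in MSE = MV + MR for one partition."""
    slope = sum(OMEGA[j] * (SIGMA[j] + MU[j] ** 2 - OMEGA[j] * DELTA[j]) for j in OMEGA)
    for block in blocks:
        weight = sum(OMEGA[j] for j in block)
        drift = sum(OMEGA[j] * MU[j] for j in block) ** 2
        spread = sum(OMEGA[j] ** 2 * (SIGMA[j] - DELTA[j]) for j in block)
        slope -= (drift + spread) / weight
    return slope


def label(blocks):
    """Canonical name such as 'abd,c' for a partition."""
    return ",".join(sorted("".join(sorted(block)) for block in blocks))


partitions = list(set_partitions(list("abcd")))
efficient = {}
for blocks in partitions:
    k = len(blocks)
    candidate = (round(mse_slope(blocks), 3), label(blocks))
    if k not in efficient or candidate < efficient[k]:
        efficient[k] = candidate

for k in sorted(efficient):
    print(k, efficient[k])
```

The enumeration visits the 15 partitions with k >= 1 (the sixteenth estimator, historical cost, uses no index) and, for each k, keeps the partition with the smallest total slope.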

In the absence of measurement error ($\Lambda=0$), the efficient frontier decreases
monotonically in k (see Sunder and Waymire 1983). This monotonicity can be verified from
column 6 of table 1 and the efficient frontier marked $\Lambda=0$ in panel B of figure 1. This
frontier in figure 1 is also convex in k ($1 \le k \le n$), although convexity has not yet been
proved for the general case.

When movement error is zero ($\mu=0$, $\Sigma=0$), it follows from Theorem 1 that the
efficient frontier is monotonically increasing in k.¹⁰ This monotonicity can also be
confirmed from column 7 of table 1 and the efficient frontier marked $\mu=0$, $\Sigma=0$ in panel C
of figure 1. The efficient frontier in the figure is also convex in k ($1 \le k \le n$), though
convexity of the frontier in general remains to be proved.

¹⁰ Since $MR(\hat{R}_{k\lambda})$ is monotonically increasing in fineness and since there always exists a (k+1)-partition
that is strictly finer than any given k-partition, it follows that the MSE of estimators in the efficient set must be
increasing in k when movement error is zero.

Figure 1
Mean Squared Error of Valuation Rules: Example 1*
Panel A. Total Error. (Plot not reproduced; vertical axis: total error; horizontal axis: number of indexes, k.)
Panel B. Movement Error Only ($\Lambda=0$). (Plot not reproduced; vertical axis: movement error; horizontal axis: number of indexes, k.)

In the general case when both movement and measurement errors are present, the
efficient frontier is not necessarily the sum of the two efficient frontiers for each type of
error. A valuation rule that is part of the efficient set for measurement errors is not
necessarily in the efficient set for movement errors, and vice versa. A valuation rule
can be in the efficient set for the total error even if it is not in the efficient set for either
type of error alone. For example, when k=3, minimum movement, measurement,
and total errors are attained by three different valuation rules ((ad,b,c), (a,bc,d), and
(ab,c,d), respectively), as can be seen in table 1 and figure 1. The efficient set and
efficient frontier for total errors are also shown in table 1 and panel A of figure 1. This
efficient frontier also is convex in k ($1 \le k \le n$) in the numerical example.

Figure 1 (continued)
Panel C. Measurement Error Only ($\mu=0$, $\Sigma=0$). (Plot not reproduced; vertical axis: measurement error; horizontal axis: number of indexes, k.)
* Each connecting line segment indicates fineness-coarseness relation between valuation rules.


Convexity of the Efficient Frontier 


If the efficient frontier for the total errors were known to be convex in k, the task of
finding the efficient frontier would be simplified considerably.¹¹ In all numerical
examples we have been able to construct so far, the efficient frontier is found to be convex.
Sunder and Waymire's (1983) search for an efficient frontier using the PPI data base
yielded a highly convex estimate of the efficient frontier. Though the general proof of
convexity has eluded us thus far, we have been able to prove the convexity of the
efficient frontier in the special case when $\Sigma$ and $\Lambda$ are diagonal matrices and the
economy-wide relative weights ($\omega_j$) as well as the expected relative price changes ($\mu_j$) across all
goods in the economy are identical. Under these assumptions, we first find a method of
constructing the efficient estimator of the current value for a given value of k; second,
the efficient frontier corresponding to this set of estimators is proved to be convex; and
third, the minimum of the efficient set (the global optimum estimator) is identified.


¹¹ Convexity of the efficient frontier for each type of error (movement and measurement) is insufficient to
ensure the convexity of the efficient frontier for total error, because the sum of the minima is only a lower
bound for the minimum of the sums.
Theorem 2: Let mean price relatives for all goods be equal ($\mu = me$, where m is a
scalar and e is a vector of unit elements), relative weights of all goods in the
economy be equal ($\omega = \frac{1}{n}e$),
and $\Sigma$ and $\Lambda$ be diagonal matrices ($\sigma_{ij}=\delta_{ij}=0$ for all $i \ne j$). Then the k-index
efficient estimator of the current value is constructed by grouping those (n-k+1)
goods that have the (n-k+1) smallest algebraic values of ($\sigma_{jj}-\delta_{jj}$) into a single
price index and letting each of the other (k-1) goods with the larger algebraic
values of ($\sigma_{jj}-\delta_{jj}$) be each in a price index by itself. The MSE of this valuation rule
$\hat{R}_{k*}$ is given by:

$MSE(\hat{R}_{k*}) = \frac{1}{n^2}\left(1-\frac{1}{p}\right)\sum_{j=1}^{n}\delta_{jj} + \frac{1}{pn}\left[\sum_{j=1}^{k-1}\delta_{jj} + \sum_{j=k}^{n}\sigma_{jj} - \frac{1}{n-k+1}\sum_{j=k}^{n}\beta_j\right],$   (13)

where $\beta_j = (\sigma_{jj}-\delta_{jj})$ and the goods are indexed in decreasing order of $\beta_j$.
Proof: See the appendix.


Theorem 2 suggests that goods with small variability in price movements ($\sigma_{jj}$) are
candidates for being lumped with other such goods; goods with larger variability in
price changes will contribute more to movement error unless they stand alone. In
contrast, goods with larger measurement error ($\delta_{jj}$) are the most attractive candidates for
being grouped with others in an index in order to maximize the benefits of
diversification through aggregation.

In our numerical example, the value of $\beta_j$ for the four goods is -2, -1, 4,
and 0, respectively. According to Theorem 2, the most efficient two-index estimator
can be constructed by including the three goods with the smallest algebraic values of $\beta_j$
(a, b, and d) in one index and allowing the fourth good c to be in an index by itself. The
most efficient three-index estimator can be constructed by including the two goods
with the smallest algebraic values (a and b) in one index and allowing goods c and d to be
each in an index by themselves. Table 1 and figure 1 confirm that (c,abd) is the efficient
two-index estimator and (ab,c,d) is the efficient three-index estimator with respect to
total MSE.
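The construction in Theorem 2 amounts to a sort on $\beta_j$. A minimal sketch, with a hypothetical function name and the $\beta_j$ values of the numerical example as input:

```python
def efficient_partition(beta, k):
    """Pool the (n - k + 1) goods with the smallest beta; leave the rest alone."""
    ranked = sorted(beta, key=beta.get)      # ascending beta_j = sigma_jj - delta_jj
    cut = len(beta) - k + 1
    pooled, singles = ranked[:cut], ranked[cut:]
    return [sorted(pooled)] + [[good] for good in sorted(singles)]


beta = {"a": -2, "b": -1, "c": 4, "d": 0}
two_index = efficient_partition(beta, 2)
three_index = efficient_partition(beta, 3)
print(two_index)
print(three_index)
```

Ties in $\beta_j$ are broken here by the input order of the goods, which is an arbitrary choice of this sketch; the theorem itself implies that tied goods are interchangeable.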

Expression (13) in Theorem 2 specifies the efficient frontier in the presence of both
movement and measurement errors. Note that the efficient frontier given in Shih and
Sunder (1987) is a special case of expression (13) with $\Lambda=0$. In the cases covered by the
theorem, the procedure for identifying estimators that are members of the efficient set
is the same whether either type of error alone, or their sum, is being considered.


Theorem 3: The efficient frontier specified by expression (13) is convex.
Proof: See the appendix.

The following two conditions must be satisfied for the minimum of this convex
efficient frontier to be attained at k*:
Corollary 3.1: Under the assumptions of Theorem 3, the following two conditions
are necessary and sufficient for the minimum of the efficient frontier to be
attained at k=k*:

(a) $\beta_{k^*} \le -\frac{1}{(n-k^*)^2}\sum_{j=k^*+1}^{n}\beta_j$   (14)

(b) $\beta_{k^*-1} \ge -\frac{1}{(n-k^*+1)^2}\sum_{j=k^*}^{n}\beta_j$   (15)

when goods have been arranged in the decreasing order of $\beta_j$.
Proof: See the appendix.


For the minimum of the efficient set to be attained at k*=1, it is sufficient (but not
necessary) that $\beta_j$ be negative for all j's. In other words, if the variance of measurement
error exceeds the variance of movement error for all goods, the efficient frontier is
monotonically increasing in k. A necessary and sufficient condition for attaining the
minimum of the efficient frontier at k*=1 (i.e., at GPL valuation) can be obtained by
substitution in expression (14):

$\beta_1 \le -\frac{1}{(n-1)^2}\sum_{j=2}^{n}\beta_j,$   (16)

where goods have been arranged such that $\beta_1 \ge \beta_j$ for all j>1.

For the minimum to be attained at k*=n, it is sufficient (but not necessary) that $\beta_j$
be positive for all j's. In other words, when the variance of price relatives exceeds the
variance of measurement error for every good, the efficient frontier is monotonically
decreasing in k. The necessary and sufficient condition for attaining the minimum of
the efficient frontier at k*=n (i.e., at the current valuation) can be obtained by
substitution in expression (15):

$\beta_{n-1} \ge -\beta_n.$   (17)


In summary, excluding the historical-cost valuation, the GPL valuation provides the
minimum MSE estimate of the unobserved current value of a firm's assets when
condition (16) is met. This counterintuitive result is obtained when the variance of errors in
measurement of prices is large compared with the variance of price relatives. Under
these conditions, the advantages of aggregation from diversifying random errors of
measurement dominate the disadvantages of using a coarser index set to estimate the
current value.

Similarly, again excluding the historical-cost valuation, when condition (17) is met,
the n-index system (with a price index for each good) is the minimum MSE estimator.
This happens when the variance of measurement error is small relative to the variance
of price relatives.

More generally, when conditions (14) and (15) of Corollary 3.1 are met at some
intermediate k* (1 < k* < n), neither the GPL nor the current valuation is the minimum
MSE estimator. Instead, the minimum MSE of all valuation rules is obtained by using
price indexes at some intermediate level of aggregation, where the benefits and
disadvantages of aggregation are balanced at the margin.

In the numerical example, conditions (14) and (15) are satisfied for an intermediate
minimum but conditions (16) and (17) are not. To verify this, consider the four goods
in decreasing order by the value of $\beta_j=(\sigma_{jj}-\delta_{jj})$, which is c (5-1=4), d (2-2=0),
b (3-4=-1), and a (1-3=-2). According to condition (16), in order for the
single-index estimator (GPL valuation) to be the minimum MSE estimator, the value of
($\sigma_{jj}-\delta_{jj}$) for good c, 4, would have to be less than or equal to

$-\frac{0-1-2}{(4-1)^2} = \frac{1}{3},$

which is not true. In order for the global minimum MSE estimator to be attained at k=4
(current valuation), it is sufficient that $\beta_j$ be positive for every good, and this condition
is not satisfied for goods a and b. The necessary and sufficient condition (17) requires
that the value of ($\sigma_{jj}-\delta_{jj}$) for good b, (-1), be greater than or equal to the negative
of the value for good a, -(-2) = 2. This condition is not satisfied either. Conditions (14)
and (15) for the minimum of the efficient frontier are satisfied at k*=2 in the numerical
example. This can also be confirmed from table 1 and figure 1.
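The location of the minimum can also be checked mechanically. The sketch below assumes the equal-weight, diagonal setting of Theorem 2 and writes the two conditions of Corollary 3.1 in the form obtained by differencing the k-dependent part of the frontier, ((n-k)/(n-k+1)) times the sum of $\beta_j$ over j >= k. Both that algebraic form and the function names are our reading rather than quotations, and the numerical example is used only loosely, since its weights are not in fact equal.

```python
def frontier_term(beta_desc, k):
    """k-dependent part of the efficient frontier (up to the factor 1/(p*n))."""
    n = len(beta_desc)
    return (n - k) / (n - k + 1) * sum(beta_desc[k - 1:])


def satisfies_corollary(beta_desc, k):
    """Check conditions (a) and (b) at k; beta_desc must be in decreasing order."""
    n = len(beta_desc)
    # (a) compares k with k + 1, so it is vacuous at k = n.
    cond_a = k == n or beta_desc[k - 1] <= -sum(beta_desc[k:]) / (n - k) ** 2
    # (b) compares k with k - 1, so it is vacuous at k = 1.
    cond_b = k == 1 or beta_desc[k - 2] >= -sum(beta_desc[k - 1:]) / (n - k + 1) ** 2
    return cond_a and cond_b


beta_desc = [4, 0, -1, -2]                       # goods c, d, b, a
passing = [k for k in range(1, 5) if satisfies_corollary(beta_desc, k)]
argmin = min(range(1, 5), key=lambda k: frontier_term(beta_desc, k))
print(passing, argmin)
```

Checking the conditions and minimizing the frontier term directly are two routes to the same k*, which gives a simple consistency test for any other $\beta$ vector.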


III. Global Minimum Mean Squared Error Estimator


In the absence of errors of measurement, the MSE of the historical cost estimator is
necessarily greater than the error of all other estimators. The top panel of figure 2
shows the relationship of the error associated with historical-cost valuation to the error
of the GPL valuation (k=1), the current valuation (k=n), and other members of the
efficient set of estimators that use more than one but fewer than n price indexes. The error of
the historical-cost estimator exceeds the GPL error by $\omega'(\Sigma+\mu\mu')\omega$, and the GPL error
exceeds the current valuation error by:

$\frac{1}{p}\left[\sum_{j=1}^{n}\omega_j(\sigma_{jj}+\mu_j^2) - \omega'(\Sigma+\mu\mu')\omega\right].$


When price measurement errors are present but price movement errors are absent,
the relationship of the historical cost valuation (HC), the GPL valuation, the current
valuation (CV), and other estimators is shown in the second panel of figure 2. The GPL
error exceeds the HC error (which is zero) by $\omega'\Lambda\omega$. The CV error, in turn, exceeds the
GPL error by:

$\frac{1}{p}\left(\omega'\delta - \omega'\Lambda\omega\right).$


In Proposition 1, we have already shown that the measurement and movement
errors of estimators are additive. Accordingly, Theorem 4 states the conditions
for the historical-cost valuation to be a more accurate estimate of the current economic
value of the firm's assets than the GPL or the CV estimators, respectively.


Theorem 4:
(i) The historical cost valuation dominates the general price level valuation in
accuracy if and only if:

$\omega'(\Lambda-\Sigma-\mu\mu')\omega > 0.$

(ii) The historical cost valuation dominates the current valuation in accuracy if
and only if:

$\left(1-\frac{1}{p}\right)\omega'(\Lambda-\Sigma-\mu\mu')\omega + \frac{1}{p}\sum_{j=1}^{n}\omega_j(\delta_{jj}-\sigma_{jj}-\mu_j^2) > 0.$

Proof: See the appendix.

Figure 2
Mean Squared Error of the Historical Cost Valuation Versus the Other Valuation Rules*
(Plot not reproduced; horizontal axis: number of indexes, k.)

* HC = Historical Cost Valuation,
GPL = General Price Level Valuation,
CV = Current Valuation,
a = Movement Error (HC) - Movement Error (GPL) = $\omega'(\Sigma+\mu\mu')\omega$,
b = Movement Error (GPL) - Movement Error (CV) = $\frac{1}{p}\left[\sum_{j}\omega_j(\sigma_{jj}+\mu_j^2) - \omega'(\Sigma+\mu\mu')\omega\right]$,
c = Measurement Error (CV) - Measurement Error (GPL) = $\frac{1}{p}(\omega'\delta - \omega'\Lambda\omega)$, and
d = Measurement Error (GPL) - Measurement Error (HC) = $\omega'\Lambda\omega$.

Lim and Sunder—Efficiency of Asset Valuation Rules 683


The historical-cost valuation can dominate not only the GPL valuation and the
current valuation in accuracy; it can also dominate the minimum MSE valuation rule
identified in Corollary 3.1. The numerical example given above is modified to illustrate this
dominance (example 2). Let $\omega$, $\mu$, and $\Sigma$ remain unchanged and allow the
variance-covariance matrix of measurement errors, $\Lambda$, to be 2.6 times as large as previously
assumed:

$\Lambda = 2.6 \times \mathrm{diag}(3, 4, 1, 2) = \mathrm{diag}(7.8, 10.4, 2.6, 5.2).$

Then the MSEs for example 2 are as follows:

Historical-cost valuation: $1.260 + \frac{2.214}{p}$,

General price level valuation: $1.716 + \frac{2.214}{p}$,

Current value valuation: $1.716 + \frac{4.524}{p}$, and

Minimum MSE estimator (HC valuation): $1.260 + \frac{2.214}{p}$.


It is easily seen that the historical-cost estimator provides, in this plausible
example, the closest approximation of the current economic value of firms, as shown in table
2 and figure 3. Also note that this dominance is valid for all values of the diversification
parameter p. As the mean and variance of price changes, $\mu$ and $\Sigma$, increase, it becomes
less likely that the historical cost valuation will dominate other estimators.
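The dominance conditions of Theorem 4 can be evaluated directly for both examples. In this sketch the weights and means are assumptions back-solved from table 1, the variances are those used in the $\beta_j$ calculation of the numerical example, and the two conditions are coded in the scalar form implied by the MSE expressions for HC, GPL, and CV with diagonal covariance matrices. The function name and the choice p = 10 are hypothetical.

```python
OMEGA = [0.2, 0.3, 0.4, 0.1]   # assumed economy-wide weights (goods a, b, c, d)
MU = [0.1, 0.2, 0.6, 0.4]      # assumed mean price relatives
SIGMA = [1.0, 3.0, 5.0, 2.0]   # variances of price relatives
DELTA = [3.0, 4.0, 1.0, 2.0]   # measurement-error variances, example 1


def hc_dominance(delta, p):
    """Return (HC beats GPL, HC beats CV) for diagonal covariance matrices."""
    mean = sum(w * m for w, m in zip(OMEGA, MU))
    # Quadratic form of the weights with the second-moment matrix of price relatives.
    movement = sum(w * w * s for w, s in zip(OMEGA, SIGMA)) + mean ** 2
    # Quadratic form of the weights with the measurement-error covariance matrix.
    measurement = sum(w * w * d for w, d in zip(OMEGA, delta))
    diag_gap = sum(w * (d - s - m * m) for w, d, s, m in zip(OMEGA, delta, SIGMA, MU))
    beats_gpl = measurement - movement > 0
    beats_cv = (1 - 1 / p) * (measurement - movement) + diag_gap / p > 0
    return beats_gpl, beats_cv


example_1 = hc_dominance(DELTA, p=10)
example_2 = hc_dominance([2.6 * d for d in DELTA], p=10)
print(example_1, example_2)
```

Under these assumptions historical cost dominates neither rule in example 1 and both rules in example 2, in line with tables 1 and 2.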


IV. Concluding Remarks 


Extant theories of valuation, largely deterministic in nature, can be integrated into
a single framework to facilitate direct comparisons of their statistical properties in
specified economic environments. In this study, we have shown that when prices
change and price data are subject to errors of measurement, neither the
current-valuation rule nor the general-price-level valuation rule is necessarily the minimum
MSE estimator of the unobserved economic value of baskets of assets. Instead, the most
accurate valuation is likely to be attained by using specific price indexes at an
appropriate level of aggregation. Many firms implemented SFAS 33 using specific price indexes;
it would be interesting to find how the levels of aggregation chosen by individual firms
relate to the theoretically efficient levels.

Table 2
Properties of Estimators of Current Value in a Four-Good Economy: Example 2

                                              Economy-Wide Average Errors
(1)  (2)     (3)     (4)                (5)                  (6)                  (7)
k    S(n,k)  Number  Partition          MSE               =  MV                +  MR
0    -       -       Historical         1.260 + 2.214/p*     1.260 + 2.214/p*     0*
1    1       1       GPL (abcd)         1.716 + 2.214/p*     2.214/p*             1.716*
2    7       1       (a,bcd)            1.716 + 3.364/p      1.765/p              1.716 + 1.599/p
             2       (b,acd)            1.716 + 3.724/p      1.205/p              1.716 + 2.519/p
             3       (c,abd)            1.716 + 2.246/p*     0.756/p*             1.716 + 1.490/p
             4       (d,abc)            1.716 + 2.564/p      1.911/p              1.716 + 0.653/p*
             5       (ab,cd)            1.716 + 2.760/p      1.044/p              1.716 + 1.716/p
             6       (ac,bd)            1.716 + 3.178/p      1.211/p              1.716 + 1.968/p
             7       (ad,bc)            1.716 + 3.038/p      1.605/p              1.716 + 1.430/p
3    6       1       (ab,c,d)           1.716 + 2.821/p*     0.481/p              1.716 + 2.340/p
             2       (ac,b,d)           1.716 + 3.971/p      0.833/p              1.716 + 3.138/p
             3       (ad,b,c)           1.716 + 3.955/p      0.297/p*             1.716 + 3.658/p
             4       (a,bc,d)           1.716 + 3.695/p      1.399/p              1.716 + 2.296/p*
             5       (a,bd,c)           1.716 + 3.732/p      0.378/p              1.716 + 3.354/p
             6       (a,b,cd)           1.716 + 4.463/p      0.563/p              1.716 + 3.900/p
4    1       1       Current Valuation  1.716 + 4.524/p*     0*                   1.716 + 4.524/p*
                     (a,b,c,d)

* Denotes members of the efficient set and efficient frontier.
k = number of price indexes in the partition (valuation rule),
S(n,k) = number of ways of partitioning a set of n distinct elements (four in our example) into k subsets
(price indexes),
$\lambda$ = partition identifier,
p = diversification parameter. See footnote 3,
$MSE(\hat{R}_{k\lambda})$ = total mean squared error of valuation rule $\hat{R}_{k\lambda}$,
$MV(\hat{R}_{k\lambda})$ = movement error of valuation rule $\hat{R}_{k\lambda}$, and
$MR(\hat{R}_{k\lambda})$ = measurement error of valuation rule $\hat{R}_{k\lambda}$.
We have shown that if the error of measurement in current prices is sufficiently
large, the historical-cost valuation may provide a statistically more accurate
approximation of the unobserved economic value of assets than is provided by the current
valuation. We have also derived conditions under which the historical-cost valuation is a
more accurate estimate of the unobserved current economic value of assets than the
most accurate of all linear valuation rules that use price indexes. Apparently, which
valuation rule provides the best estimate of the current economic value is not a matter
of theory or principle, but simply a matter of the relative magnitudes of parameters that
characterize the economy: relative weights of various goods in the economy ($\omega$),
expected percentage price change for individual goods ($\mu$), and the covariance matrices
of these price changes ($\Sigma$) and of measurement errors in price changes ($\Lambda$).
These results have several testable implications. Other things being equal, valuation
rules based on a finer set of price indexes will be more informative for those firms and
industries: (1) whose assets have a larger mean rate of price change; (2) whose assets
have a more variable rate of price change; and (3) whose assets are traded in relatively
perfect and complete markets, permitting more accurate measurement of change in
their price. Real estate, oil and gas deposits, films, videos, software, and patents are
examples of assets whose prices may have large measurement errors. Current valuation
of firms with large holdings of such assets may not be more accurate than their
historical valuation. Most of the empirical work has focused on cross-sectional analysis
so far (see Beaver et al. 1980, 1982; Gheyara and Boatsman 1980; Ro 1980). Our model
suggests that closer attention to the economic environments of specific firms and
industries will increase the power of tests to detect the information value of current value
data reported by firms.

Figure 3
Mean Squared Error of Valuation Rules: Example 2*
Panel A. Total Error. (Plot not reproduced; vertical axis: total error; horizontal axis: number of indexes, k.)
Panel B. Movement Error Only ($\Lambda=0$). (Plot not reproduced; vertical axis: movement error; horizontal axis: number of indexes, k.)
Panel C. Measurement Error Only ($\mu=0$, $\Sigma=0$). (Plot not reproduced; vertical axis: measurement error; horizontal axis: number of indexes, k.)
* Each connecting line segment indicates fineness-coarseness relation between valuation rules.

Second, the model suggests that efficient valuation rules may be quite different for
different firms and industries. During the same period of time, historical cost may be
most informative for some industries, while GPL or a 10- or 100-index valuation rule may
be the most informative for others.

Third, the level of aggregation at which asset values are adjusted to their current
estimates has a major impact. Since both ASR 190 and SFAS 33 granted wide latitude to
individual firms in choosing the level of aggregation, different firms may have reported
at quite different levels of aggregation, making it difficult to draw conclusions from
cross-sectional studies that ignore this heterogeneity.

Fourth, the degree to which extant accounting valuation practices approximate the
efficient rules can be determined empirically by (1) estimating the numerical values of
parameters of the model, (2) identifying efficient valuation rules, and (3) comparing
such rules with the extant practice. Theorem 2 provides a guide to testing whether the
Bureau of Labor Statistics' scheme of clustering prices of individual goods into price
indexes is an efficient one. Broadly speaking, the theoretical framework for valuation
rules will make it possible to map the characteristics of valuation rules more precisely,
and to conduct empirical tests of propositions about these characteristics.
Fifth, during periods of inflation, increasing the revaluation interval is likely to
increase the magnitude of $\Sigma+\mu\mu'$ (expectation of squared price relatives) faster than the
magnitude of $\Lambda$ (expectation of squared measurement errors in price relatives) if
measurement errors are more likely to be diversified away across time periods. Under these
conditions, current valuation is more likely to dominate historical-cost and GPL valuation
over longer intervals of persistent inflation (or deflation).

We have used a number of assumptions about parameters and the model to derive 
these results. Some of these assumptions (e.g., those used in Theorem 2 to prove con- 
vexity of the efficient frontier) are mere stopgap measures, until somebody is able to 
prove (or disprove) the convexity of this function under more general conditions. Other 
assumptions are based on our belief that they allow us to obtain important qualitative 
results from a simple model without undue violence to the environment we seek to 
understand. Others may wish to modify such assumptions to capture the implications 
of various refinements in the basic model.

For example, we assume that the asset portfolios of all individual firms have identical
distributions. Lim and Sunder (1990) drop this assumption and model an industry-
segmented economy to examine the properties of valuation rules that utilize industry 
specific versus economywide price indexes. We also assume that the measurement 
errors in price relatives are unbiased. There is some empirical evidence (see Swanson
and Shriver 1987; Hall and Shriver 1990; Swanson 1990) that certain price databases 
may have an upward bias. The model can be modified to explore the impact of this and 
other such refinements (see fn. 8). We have assumed that the price data for all assets of 
all firms in the economy are included in constructing the price indexes used for valua- 
tion. Additional valuation errors could be generated when this assumption is relaxed in 
various ways. For example, it is rarely possible to sample price changes for all assets in 
an economy; even the Bureau of Labor Statistics resorts to sampling for only about 
3,000 different goods and services. Beaver et al. (1982, fn. 2) point out that changes in 
value of unrecorded assets go unreported, and, as a practical matter, even some of the
recorded assets may be excluded from revaluation. Further work would be needed to 
assess the magnitude of effect such deviations may have on predictions of the simple 
model presented here. 

There may be situations for which the expected error and expected squared error 
loss functions used in this study may be considered inadequate. Computer simulations
could be used to discover which of our results would generalize to other loss functions. 

Finally, the analysis here is limited to linear valuation rules. Lim (1990) derives 
properties of nonlinear (e.g., lower-of-cost-or-market) valuation rules using a similar 
framework.


Appendix
I. Derivation of the Mean Squared Error Expression (6)
MSE( Ato) =EwE,E,(Bia—Ba1)?=E,E,E, (o t- w'r) 


x 
aPp= ae Gre 
ay g 
-w¥=total weight in w of goods included in the same index as good j, 
w7 =total weight in w of goods included in the same index as good f, 
Ge of os included in the uth index, and 


E and A,, are submatrices of X and A, respectively, consisting of rows and columns corresponding to 
the goods included in the uth index, 


EEE, Har) fF’ a +w’ ir EK w). (A.1) 


The right-hand side of expression (A.1) is the expectation of the sum of the terms of an n×n symmetric
matrix. The jth diagonal element of this matrix is:

Ww} a, 
s Wy, (A.2)} 


Gr 


2 wi w,\* Zei. ch 
(nte) oe + WT; ~2{r,+e,)r, 
J 


where e=}; -rh 
Using the mutually independent distributions of w, r, and e specified earlier, the expectation of the jth 
diagonal term is given by: 


aed al SENET e 
P po p pof pof p 


where p is the number of multinomial draws used to construct w.
The i, jth off-diagonal element of the matrix, if both goods i and j are included in the same index, is given by:

(rte tew" wE -2# owne). f (A.A) 
Eh ws wf ; 
and its expectation with respect to w, r, and e is given by:
ay RE -jif w; daa w; ay + (1-F) ow. - {A.5) 
Géi Dréi pur P 


Finally, the i, jth off-diagonal element of the matrix, if goods i and j are not included in the same index
under valuation rule $R_k$, is also given by expression (A.4), and its expectation is:


LP ( A (5, Wy. (A.6) 


Adding up all terms of this matrix with some rearrangement yields the mean squared error (MSE) of a
valuation rule $R_k$ in the presence of measurement error:


MSE( fin) = kk nee Le gbeg:  Loter 8. 15 Cal aw teetan (A.7) 
P Pumi "Gg Pp P as? wie 
II. Proof of Theorems
Proof of Theorem 1. 


The first term in equation (7) is a constant with respect to various index configurations. To show that the
second term is monotonically increasing in fineness, the proof by Sunder (1978, 364-5) for movement error is
directly applicable because A is a positive semidefinite matrix. Proof of Theorem 1 follows immediately.
Q.E.D.

See footnote 3 for distribution of w.


Proof of Theorem 2. 
From equation (6), the mean squared error of valuation rule $R_k$ is:


ve Du 21 Eloy tem) £ : GE Ek (A.8) 


jut pn gei „ fei 





ell ide E E 


DJ ja) Pn jet ph jai Ta 


where $n_u$ is the number of goods in the $u$th index and $n_j$ is the number of goods in the index in which good $j$ is
included. In equation (A.8), all terms except the one in

$$\sum_{j=1}^{n} \frac{\beta_j}{n_j}$$

are constant with respect to index configurations; therefore minimizing MSE is equivalent to maximizing:

$$\sum_{j=1}^{n} \frac{\beta_j}{n_j}.$$

Let this accuracy measure for the valuation rule $R_k$ be denoted by $A_k$, and the best $A_k$ given $k$ be denoted by $A_k^*$. Without loss of generality, let the $n$ goods
be ordered such that $\beta_j \geq \beta_{j+1}$, $j = 1, 2, \ldots, (n-1)$. Since the $n$-index system is unique:

$$A_n^* = A_n = \sum_{j=1}^{n} \beta_j. \tag{A.9}$$

For $k = n-1$, some two goods, indexed $p$ and $q$, must be combined into a single index while all other $(n-2)$
goods form single-good indexes.

$$A_{n-1} = \sum_{j=1}^{n} \beta_j - \frac{\beta_p + \beta_q}{2}. \tag{A.10}$$

Since the first term on the right hand side is a constant, to maximize $A_{n-1}$ the sum, $\beta_p + \beta_q$, should be minimized;
that is, two goods with the smallest algebraic values of $\beta_j$ should be clubbed together into a single
index. Since $\beta_n$ and $\beta_{n-1}$ are the two smallest values by the above ordering,

$$A_{n-1}^* = \sum_{j=1}^{n} \beta_j - \frac{\beta_n + \beta_{n-1}}{2}. \tag{A.11}$$

When $k = n-2$, there are two possible ways of partitioning the above $(n-1)$ indexes into $(n-2)$ indexes:

(1) Combine a third good, indexed $r$, into the above two-good index which contains goods $n$ and $(n-1)$.
Let its accuracy measure be $A^{1}_{n-2}$.

(2) Combine other two single-good indexes for goods indexed $s$ and $t$, into one. Let its accuracy measure
be $A^{2}_{n-2}$.

Then we can show: $A^{1*}_{n-2} \geq A^{2*}_{n-2}$.

$$A^{1}_{n-2} = \sum_{j \neq n,\, n-1,\, r} \beta_j + \frac{\beta_n + \beta_{n-1} + \beta_r}{3} = \sum_{j=1}^{n} \beta_j - \frac{2(\beta_n + \beta_{n-1} + \beta_r)}{3}. \tag{A.12}$$

Since $\beta_{n-2}$ is the third smallest, $A^{1}_{n-2}$ is maximized by setting $r = n-2$:

$$A^{1*}_{n-2} = \sum_{j=1}^{n} \beta_j - \frac{2(\beta_n + \beta_{n-1} + \beta_{n-2})}{3}. \tag{A.13}$$

For the second method of constructing $(n-2)$ indexes from $n$ goods,

$$A^{2}_{n-2} = \sum_{j=1}^{n} \beta_j - \frac{\beta_n + \beta_{n-1}}{2} - \frac{\beta_s + \beta_t}{2}. \tag{A.14}$$

$A^{2}_{n-2}$ is maximized by setting $s = n-2$ and $t = n-3$. Therefore,

$$A^{2*}_{n-2} = \sum_{j=1}^{n} \beta_j - \frac{\beta_n + \beta_{n-1} + \beta_{n-2} + \beta_{n-3}}{2}. \tag{A.15}$$

In order to determine which of these two methods yields a better $(n-2)$-index system, we calculate their difference:

$$A^{1*}_{n-2} - A^{2*}_{n-2} = \frac{1}{2}(\beta_n + \beta_{n-1} + \beta_{n-2} + \beta_{n-3}) - \frac{2}{3}(\beta_n + \beta_{n-1} + \beta_{n-2})$$

$$= \frac{1}{6}\left[3\beta_{n-3} - (\beta_n + \beta_{n-1} + \beta_{n-2})\right]$$

$$\geq 0,$$

since $\beta_j \geq \beta_{j+1}$ for all $j$. Therefore, the good with the third smallest $\beta_j$ should be combined with the two-good
index in $A^{*}_{n-1}$ in order to construct the most efficient $(n-2)$-index valuation rule.

If the best $(n-2)$-index valuation is obtained by combining the three goods with the three smallest
algebraic values of $\beta_j$ into a single index, it can similarly be shown that the best $(n-3)$-index valuation
requires combining the four goods with the four smallest values of $\beta_j$. Thus a mathematical induction proves
Theorem 3. Equation (13) in the theorem follows from (A.8) and (A.13).
Q.E.D.
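The induction argument above can be checked numerically on a small case. The sketch below is illustrative only and is not part of the original proof: it assumes the accuracy measure is the sum over goods of β_j/n_j (as the surrounding text indicates), and the function names and the β values are hypothetical. It compares the greedy configuration (pool the goods with the smallest β_j into one index, leave the rest as single-good indexes) against a brute-force search over all configurations.

```python
from itertools import product


def accuracy(beta, labels):
    """Sum of beta_j / n_j, where n_j is the size of the index holding good j."""
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    return sum(b / sizes[lab] for b, lab in zip(beta, labels))


def best_by_brute_force(beta, k):
    """Largest accuracy over every assignment of goods to k non-empty indexes."""
    best = None
    for labels in product(range(k), repeat=len(beta)):
        if len(set(labels)) != k:  # skip assignments that leave an index empty
            continue
        score = accuracy(beta, labels)
        if best is None or score > best:
            best = score
    return best


def greedy(beta, k):
    """Keep the k-1 largest betas as single-good indexes; pool all the others."""
    order = sorted(range(len(beta)), key=lambda j: beta[j], reverse=True)
    labels = [0] * len(beta)
    for rank, j in enumerate(order):
        labels[j] = rank if rank < k - 1 else k - 1
    return accuracy(beta, labels)


if __name__ == "__main__":
    beta = [5.0, 3.0, 2.0, 1.0]  # hypothetical values, ordered beta_j >= beta_{j+1}
    for k in range(1, len(beta) + 1):
        print(k, greedy(beta, k), best_by_brute_force(beta, k))
```

The brute-force search grows exponentially in the number of goods, so it is useful only as a sanity check on small examples.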
Proof of Theorem 3. 
From Theorem 2, the optimum index configurations for k=m, m—1, and m+1 are as follows: 
At= Ye ee (A.16) 
-1 jag Q— ry 
== T 
Ze E n— SE 
and 
B; 
an= a4 > ide a 
al jen+i NM 
Now for the convexity, we will show:
$$H(m) - H(m-1) \leq H(m+1) - H(m)$$
or
$$H(m) - \tfrac{1}{2}\{H(m-1) + H(m+1)\} \leq 0. \tag{A.17}$$


From expressions (A.8) and (A.16), left hand side of expression (A.17): 


SR (Zee d Ben an Eee EA m+2 +E} i 
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1 : 1 R y 
=m" re a EE 


` +(n—m)(n—m+1)(Be-i+Ba)i]- 


Since A 8,s(n-m)}ß;, 


jam+i 


left hand side 


i , 
sl- Bami) t n mu a pa p g n lite lte, 


A e 3] 


a GM Est SC 
+2 


s0, . 


which is true because $\beta_m \geq \beta_{m+1}$ and $(n - m + 2) \geq 1$ for all $m$. This proves the convexity of the efficient frontier.
Q.E.D.


Proof of Corollary 3.1. 
Since H(k) is convex in k, for the minimum to be attained at k = k*, it is necessary and sufficient that
H(k* − 1) ≥ H(k*) and H(k*) ≤ H(k* + 1).


Conditions of Corollary 4.1 are derived directly by substitution from expression (13) in the above inequalities.
Q.E.D.


Proof of Theorem 4.
(i) From expression (A.7), MSE of the general price level valuation is:


Mee, Ju äere ler-stfaseit, (A18) 


From equations (4) and (A.18), 
MSE(R,,1)—MSE(Ro 1) =’ LA Cal, 
For the historical cost valuation to dominate the general price level valuation, this difference must be positive. 


(ii) From equations (4) and (6), the difference between MSE of the historical-cost valuation and that of the
current valuation is given as in Theorem 4.
Q.E.D.


References


Apostol, T. 1967. Calculus. Vol. II. Waltham, MA: Blaisdell.

Beaver, W. H., A. Christie, and Paul A. Griffin. 1980. The information content of SEC
Accounting Series Release No. 190. Journal of Accounting and Economics 2 (August):
127-57.

———, and J. Demski. 1979. The nature of income measurement. The Accounting Review 54
(January): 38-46.

———, P. A. Griffin, and W. R. Landsman. 1982. The incremental information content of
replacement cost earnings. Journal of Accounting and Economics 4 (January): 15-39.

Bureau of Labor Statistics. 1976. Handbook of Methods, Bulletin 1910. Washington, D.C.:
Government Printing Office.

Chambers, R. J. 1966. Accounting, Evaluation and Economic Behavior. Englewood Cliffs, NJ:
Prentice-Hall.







Early, J. 1978. Improving the measurement of producer price changes. Monthly Labor Review
101 (April): 7-15.

———. 1979. The producer price index revision: Overview and pilot survey results. Monthly
Labor Review 102 (December): 11-19.

Eckern, S., and R. Wilson. 1974. On the theory of the firm in an economy with incomplete
markets. Bell Journal of Economics and Management Science (Spring): 171-80.

Edwards, E. O., and P. W. Bell. 1961. The Theory and Measurement of Business Income.
Berkeley: University of California Press.

Feller, William. 1968. An Introduction to Probability Theory and Its Applications. Vol. I, 3d ed.
New York: Wiley.

Financial Accounting Standards Board. 1979. Statement of Financial Accounting Standards
No. 33: Financial Reporting and Changing Prices. Stamford, CT: FASB.

Gheyara, K., and J. Boatsman. 1980. Market reaction to the 1976 replacement cost disclosures.
Journal of Accounting and Economics 2 (August): 107-25.

Hall, T. W. 1982. An empirical test of the effect of asset aggregation on valuation accuracy.
Journal of Accounting Research 20 (Spring): 139-51.

———, and Keith A. Shriver. 1990. Econometric properties of asset valuation rules under price
movement and measurement errors: An empirical test. The Accounting Review 65 (July):
537-56.

Ijiri, Y. 1967. The Foundations of Accounting Measurement: A Mathematical, Economic and
Behavioral Inquiry. Englewood Cliffs, NJ: Prentice-Hall.

———. 1968. The linear aggregation coefficient as the dual of the linear correlation coefficient.
Econometrica 36 (April): 252-59.

———, and J. Noel. 1984. A reliability comparison of the measurement of wealth, income, and
force. The Accounting Review 59 (January): 52-63.

Lev, Baruch. 1969. Accounting and Information Theory. Studies in Accounting Research #2.
Sarasota, FL: American Accounting Association.

Lim, S. S. 1990. Accuracy of partial valuation rules and optimal use of price indexes. Ph.D.
dissertation. University of Minnesota, Minneapolis.

———, and S. Sunder. 1990. Accuracy of linear valuation rules in industry segmented
environments: Industry vs. economy weighted indexes. Journal of Accounting and Economics 13
(July): 167-88.

The Price Statistics Review Committee of National Bureau of Economic Research. 1961. The
Price Statistics of the Federal Government, General Series No. 73.

Radner, R. 1974. A note on unanimity of stockholders’ preferences among alternative
production plans: A reformulation of the Eckern-Wilson model. Bell Journal of Economics
and Management Science 5 (Spring): 181-84.

Ro, B. 1980. The adjustment of security prices to the disclosure of replacement cost accounting
information. Journal of Accounting and Economics 2 (August): 159-89.

Shih, S., and S. Sunder. 1987. Design and tests of an efficient search algorithm for accurate
linear valuation system. Contemporary Accounting Research 4 (Fall): 16-31.

Shriver, K. 1986. Further evidence on the marginal gains in accuracy of alternative levels of
specificity of the producer price indexes. Journal of Accounting Research 24 (Spring):
151-65.

———. 1987. An empirical examination of the potential measurement error in current cost
data. The Accounting Review 62 (January): 79-96.

Sterling, R. 1970. Theory of the Measurement of Enterprise Income. Lawrence: University Press
of Kansas.

Sunder, S. 1976. Properties of accounting numbers under full costing and successful-efforts
costing in the petroleum industry. The Accounting Review 51 (January): 1-18.

———. 1978. Accuracy of exchange valuation rules. Journal of Accounting Research 16
(Autumn): 341-67.

———, and G. Waymire. 1983. Marginal gains in accuracy of valuation from increasingly
specific price indexes: Empirical evidence for the U.S. Economy. Journal of Accounting
Research 21 (Autumn): 565-80.































———, and ———. 1984. Accuracy of exchange valuation rules: Additivity and unbiased
estimation. Journal of Accounting Research 22 (Spring): 396-405.

Swanson, E. 1990. Relative measurement errors in valuing plant and equipment under current
cost and replacement cost. The Accounting Review 65 (October): 911-24.

———, and K. A. Shriver. 1987. The accounting for changing prices experiment: A valid test of
usefulness? Accounting Horizons 1 (September): 69-78.

Tippett, M. 1987. Exchange valuation rules: Optimal use of specific price indexes in asset
valuation. Accounting and Business Research 17 (Spring): 141-54.

Triplett, J. 1975. The measurement of inflation: A survey of research on the accuracy of price
indexes. P. Earl (ed.). Analysis of Inflation. Lexington, MA: Lexington Books. 19-82.

Tritschler, C. 1969. Statistical criteria for asset valuation by specific price index. The
Accounting Review 44 (January): 99-123.





THE ACCOUNTING REVIEW 
Vol. 66, No. 4 
October 1991
pp. 694-717


A Laboratory Market Examination 

of the Consumer Price Response to 

Information about Producers’ Costs 
and Profits 


Steven J. Kachelmeier 
Stephen T. Limberg 
Michael S. Schadewald 
The University of Texas at Austin 


SYNOPSIS AND INTRODUCTION: Using laboratory market data, this
study demonstrates that consumers respond differently to a market event
depending on the information reported about the event. Specifically, they
respond more rapidly to an economically predicted price increase when
they are informed that sellers’ marginal costs have increased, but they
resist price increases if they know that the sellers’ profits have increased.

These information effects are based on the principle of dual entitlements,
which posits that purchase decisions are influenced not only by the direct
economic utility of the purchase, but also by consumers’ perceptions of the
equity or fairness of a negotiated price. Survey evidence from prior studies
indicates that consumers (buyers) justify price increases driven by increases
in sellers’ costs, but resist price increases that increase sellers’ profits. This
study goes beyond surveys to investigate these predictions in a market
setting affected by an economic event that simultaneously increases both
the marginal costs incurred and the profits earned by sellers. Specifically,
we examine the combined effect of a change in the sellers’ tax rate and tax


Nine laboratory markets in three separate financial information
structures were conducted to investigate the predicted information effects.
Each market had ten traders (five buyers and five sellers), for a total of 90
subjects. In three markets, buyers were apprised of an increase in sellers’
marginal tax costs. In three other markets, buyers were informed of an
increase in sellers’ after-tax profits. Finally, three control markets with no
information disclosures served as a baseline. Subjects in each market were
student volunteers who received their market profits in real cash, in
conformance with the tenets of induced-value theory.

The results have implications for the financial disclosures volunteered
by firms or mandated by regulatory bodies. While the accounting literature
has traditionally stressed information effects on investors (Lev 1989), the
body of users affected by financial reporting is much larger and includes
the consumers who purchase the goods and services of disclosing firms
(Financial Accounting Standards Board 1978, par. 24). This study suggests
that financial disclosures can influence consumer behavior in competitive
markets for goods and services.


Key Words: Information disclosure, Laboratory markets, Fairness.


Data Availability: The data will be made available to interested readers for
research purposes.


The remainder of this article is organized in four sections. First, we introduce a
theory from the economics literature to formulate hypotheses about the information
effects of interest. Second, we present the design and implementation of nine
laboratory markets used to test the hypotheses. We discuss results in the third section,
then present concluding observations and directions for further study in the fourth section.


I. Theory and Hypotheses
The Principle of Dual Entitlements 


In related articles, Kahneman et al. (1986a) and Thaler (1985) propose a “principle
of dual entitlements” to predict how price reactions are guided by perceptions of the
equity or fairness of market prices in addition to economic incentives. Underlying this
theory is the notion that consumers derive both “acquisition utility” and “transaction
utility” from market purchases. Acquisition utility captures a consumer’s willingness
to purchase a good on the basis of economic incentives as traditionally modeled. By
contrast, transaction utility captures any incremental willingness or hesitance to buy a
good depending on whether the proposed price is justifiable in the context of the
consumer’s perception of fairness. For example, a consumer may have some fixed acquisition
utility for a gallon of gasoline, depending upon his or her transportation alternatives
and desire to drive. However, this acquisition utility may overstate the
consumer’s actual demand for gasoline if the consumer senses that price hikes at
the pump reflect unjustified profit taking in response to, for example, last year’s Iraqi
invasion of Kuwait. Conversely, if the same consumer is apprised of actual increases in
the dealer’s wholesale gasoline costs, price increases may be more palatable.

Kahneman et al. (1986a) support their predictions using data from telephone
surveys involving a variety of price change scenarios. Their evidence indicates that
individuals accept price increases that are justified by increases in sellers’ costs, but
resist price increases that also increase sellers’ profits. The latter finding indicates that
considerations of fairness are not necessarily inconsistent with self-interested behavior
(also see Isaac et al. 1991). Buyers who are informed of an increase in sellers’ profits are
not expected to respond with altruism or charity. To the contrary, if these buyers
believe they are being unfairly exploited by the sellers, they will react in their
self-interest by withholding demand, thereby hindering a price increase.

It is unknown, however, whether these attitudes translate to actual decisions in a
market environment. For example, Coursey et al. (1987) and Knez et al. (1985) provide
evidence that behavior elicited from surveys can differ substantially from behavior
observed in laboratory markets with induced economic incentives. In studies involving
two-person “ultimatum games” (see, e.g., Kahneman et al. 1986b; Ochs and Roth 1989;
Thaler 1988), researchers have demonstrated that fairness effects can be present even
when subject decisions involve actual cash payoffs.¹ However, Forsythe et al. (1988)
suggest that propensities to act fairly in ultimatum games are not as pronounced when
there are no potential retributions for unfair behavior (as in a “dictator” game).² Further,
neither ultimatum nor dictator games directly address the price predictions of the
principle of dual entitlements. Most importantly, two-person games and telephone surveys
lack the structure of an actual market setting.


The Advantages of a Tax Setting for Testing the Theory 


It is implicit in the predictions of the principle of dual entitlements that consumers 
rely on an information structure to form their assessments regarding costs and profits. 
That information structure is of direct interest in this study. We wanted a single market 
event that would increase both sellers’ marginal costs and their profits. Such an event 
would enable us to use alternative information disclosures to isolate the theoretical pre- 
dictions. The change in the taxation of sellers’ profits from a pure income tax to a 
consumption (sales) tax was the economic event we chose. Given that the change is tax 
neutral (the same aggregate taxes are collected when both regimes are in equilibrium) 
and assuming nonextreme elasticities of demand and supply, this change can result in 
both increased seller marginal costs and increased seller profits. To illustrate, consider 
the effect of imposing a 50 percent pure income tax on sellers in a market that previ- 
ously had no tax. (We use the term “pure” income tax to imply that all costs are deducti- 
ble in computing taxable income, including opportunity costs.) Assuming an upward- 
sloping marginal cost (supply) schedule, the 50 percent income tax extracts half of the 
producer surplus on each sale. However, at the competitive margin of zero profit, tax is 
also zero. Thus, a seller’s marginal incentive to trade is unaffected by the income tax. 
That is, the after-tax supply curve is identical to the pretax supply curve. A pure in- 
come tax has the effect of extracting only “windfall” producer surplus on premarginal 
trades. 


1 In an ultimatum game, one subject (the allocator) proposes a division of a prize (say, $1.00) with the other
player (the recipient). The recipient then has the option of either accepting or rejecting the proposal. If rejected,
neither party receives a payoff.

2 A dictator game is identical to an ultimatum game except the recipient does not have the option to reject
the proposal.

Now replace the income tax with a 20 percent sales tax on sellers’ revenues. Unlike
the income tax, the new sales tax shifts the after-tax supply curve. This is because, at
any pre-tax marginal cost MC(q) at quantity q, a seller must now recover a price of at
least MC(q)/(1−τ) to break even, where τ is the sales tax rate. Thus, all marginal costs
increase by the amount of τMC(q)/(1−τ), or 0.25 MC(q) for τ=0.2, shifting the supply
curve upward. At the same time, the percentage of total surplus available to sellers in
economic equilibrium is higher than that available under the income tax regime because
of lower taxes on premarginal trades.

In sum, the tax scenario described above creates both increasing costs and increasing
profits. Therefore, we are able to predict opposing price forces in response to cost versus
profit information without introducing deception (i.e., all disclosures are truthful).³


Hypotheses 


The change from an income tax to a sales tax has the effect of shifting the supply 
curve upward, hence increasing the competitive price prediction. We examine how the 
speed and extent of convergence to the new price prediction might be influenced by 
specific disclosures about sellers’ new costs and profits. We propose to test the follow- 
ing hypothesis based on the principle of dual entitlements: 


H1: The price response to a change from a sellers’ income tax to a sellers’ sales tax 
will be greater when buyers are informed of the effect of the tax on sellers’ 
costs than when buyers are informed of the effect on sellers’ profits. 


In addition, we ask the more stringent question of whether responses to either of
these disclosure structures will be different from a baseline structure with no buyer
disclosure of either costs or profits:


H2: Cost information will hasten price convergence relative to the convergence
observed in markets with no disclosures.

H3: Profit information will depress price convergence relative to the convergence
observed in markets with no disclosures.


A final hypothesis examines the implications of the price hypotheses on market
volume and efficiency. Although the principle of dual entitlements as originally formulated
did not consider these effects, it is interesting to consider what it means for buyers
to “resist” price increases due to perceptions of inequitable profits earned by sellers. If
this resistance depresses market prices below a new competitive equilibrium, then
buyers are forming a tacit cartel, intentionally forgoing profitable purchases to keep
prices down. This, in turn, implies lower volume and lower efficiency in markets where
predicted price increases are resisted.⁴ We examine these implications as a fourth
experimental hypothesis:


3 The tax setting was advantageous because it allowed us to separate and test the information predictions of 
the principle of dual entitlements. We make no claims about tax policy implications since we have no model of 
optimal tax policy for forming policy assessments. 

4 We use the term “efficiency” to denote the degree to which laboratory traders successfully extract the
amount of market surplus theoretically attainable under the induced demand and supply structure.
Experimental economists often refer to this metric as “allocative efficiency,” but allocation issues are
irrelevant in this experiment because there are no competing goods.

H4: Profit information will lead to lower volume and lower market efficiency than
in either the cost-disclosure or no-disclosure markets.

II. Method
Experimental Design and Subjects


We conducted nine laboratory markets with ten traders each. Subjects were
recruited on a volunteer basis from senior undergraduate and graduate accounting
classes. As explained below, subjects earned actual monetary profits from their trades,
with total payoffs ranging from $20 to $46 per subject. The average compensation was
$34 for an experimental session that lasted three hours.

In three of the markets, buyers were informed about the effects of the tax change on
sellers’ marginal costs. In three other markets, buyers were informed about the effects
of the tax change on sellers’ after-tax profits. Finally, three control markets with no
disclosure to buyers about the change in sellers’ taxes served as a baseline. The sequential
order of markets was randomized to minimize the threat of contaminating carryover
effects. The procedures used in implementing the various disclosure structures are
discussed below.


Experimental Materials and Procedure 


The markets were conducted in a behavioral laboratory with two soundproof
rooms, one of which contained networked computer terminals programmed for
market trading. We began each experimental session by distributing and reading the
instructions shown in the appendix. These instructions were common to all treatment
conditions and communicated the structure of the laboratory market, the computation
of pretax profits, and the 50 percent income tax on sellers’ profits.* Examples in the
instructions used extreme values (about ten times the magnitude of actual redemption
values and costs) to avoid price anchoring.

After the instructions were read, the ten subjects drew cards that randomly
assigned them to one of five buyer positions or one of five seller positions. Subjects were
then led into the adjoining room with the computer terminals and were seated at the
appropriate computer stations. These stations were partitioned such that subjects could
not see or otherwise communicate with each other during trading. Once seated, subjects
were given a prepared text about the computer trading mechanism. We used the
Multiple Unit Double Auction program written by Johnson et al. (1988). Subjects then
began a 15-minute practice trading period with nonsense prices to familiarize themselves
with the computer.

After the practice session, the actual redemption value and cost schedules were
distributed to buyers and sellers, respectively. Each buyer had a different downward-sloping
redemption value schedule, and profited from the excess of the redemption
value over the negotiated price for a given purchase. Each seller had a different
upward-sloping cost schedule, and profited from the excess of the negotiated price over
this cost and a 50 percent income tax on the pretax profit.

Figure 1
Supply and Demand Functions Under Different Tax Regimes

The aggregation of the five redemption values and five cost schedules results in an
induced market demand and supply (see Smith 1976), as shown in figure 1. Although
the lower supply curve in figure 1 is labeled as the supply curve under an income tax, it
is identical to the implied supply curve with no tax. The income tax simply has the effect
of extracting half of the producer surplus on premarginal trades. Other than this
income tax extraction of sellers’ pre-tax profits, buyers and sellers kept the actual
monetary surpluses indicated in figure 1.

Five units were scheduled on each buyer’s redemption value sheet and on each 
seller’s cost sheet, which aggregates to the 25 units scheduled in figure 1. The intersec- 
tion of demand and supply results in an equilibrium volume prediction of 16 units at a 
competitive equilibrium price of $2.50. The fifteenth and sixteenth units occur at the
zero profit economic margin. However, past laboratory market evidence (see, e.g., Plott 
and Smith 1978) indicates that subjects perceive small subjective transaction costs, and 
hence are unwilling to trade for a zero profit. A common laboratory market approach to 
address this problem is to offer a small fixed commission for each trade. We followed 
this technique, informing buyers and sellers that a five-cent commission would be 
added for each completed trade. 

The trading rule allowed only buyers to propose prices. The standing bid was dis- 
played on each trader’s screen, identifying the buyer with the current highest bid price. 
Another buyer could replace the standing bid by entering a higher bid at any time. 
Sellers accepted these bids at their discretion. An accepted bid registered as a sale on 
both the appropriate buyer’s and seller’s screens, which also updated quantities on 
hand. Once a bid was accepted, the standing bid reinitialized to zero, thereby allowing 
further proposals to be made for the time remaining in the period. 
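The trading rule just described can be sketched as a small state machine. This is a hypothetical illustration, not the Multiple Unit Double Auction program used in the experiment; the class and method names are assumptions, and the sketch omits inventories, time limits, and the five-cent commission.

```python
class BidOnlyMarket:
    """Minimal model of a bid-only market with a single standing bid."""

    def __init__(self):
        self.standing_bid = 0.0     # current highest bid price
        self.standing_buyer = None  # buyer holding the standing bid
        self.trades = []            # completed (buyer, seller, price) records

    def submit_bid(self, buyer, price):
        """Replace the standing bid only if the new bid is strictly higher."""
        if price > self.standing_bid:
            self.standing_bid = price
            self.standing_buyer = buyer
            return True
        return False

    def accept(self, seller):
        """A seller accepts the standing bid; the bid then resets to zero."""
        if self.standing_buyer is None:
            return None
        trade = (self.standing_buyer, seller, self.standing_bid)
        self.trades.append(trade)
        self.standing_bid = 0.0
        self.standing_buyer = None
        return trade


if __name__ == "__main__":
    market = BidOnlyMarket()
    market.submit_bid("buyer 1", 2.00)
    market.submit_bid("buyer 2", 2.20)  # outbids buyer 1
    print(market.accept("seller 4"))
    print(market.standing_bid)
```

Because only buyers can move the price, any reluctance to bid up to the new competitive price shows up directly in the sequence of standing bids, which is the behavior the design aims to isolate.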

This bid-only mechanism enhanced our ability to assess the predictions of the 
principle of dual entitlements in a market setting. Since the theory models price re- 
sponses to perceptions of fairness on the part of buyers, a bid-only mechanism serves to 
isolate these responses without contamination by sellers’ counteroffers. As such, the 
bid-only mechanism is an example of “intentional experimental artificiality” (Swier- 
inga and Weick 1982), which is used to increase precision in an experimental setting. 

Subjects traded under the income tax regime for ten laboratory market periods (i.e., 
the demand and supply structure shown in figure 1 was replicated ten consecutive 
times within each market). With the exception of a four-minute limit for the first two 
periods, each market period was limited to three minutes. 


Tax Change and Treatment Manipulations 


After the tenth laboratory market period, the experimenter announced a brief 
break. The ostensible purpose for this break was to give subjects a chance to stretch and 
enjoy refreshments. Subjects were led into the adjacent room for this purpose, and were 
strictly enjoined against talking about the experiment. Three research assistants as well 
as the researchers monitored all subjects closely at all times to maintain control. There 
were no breaches of the rule to refrain from talking about the experiment. 

Five minutes into the break, one of the researchers announced that we would pay 
subjects for their profits during the first ten periods to show our good faith. The re- 
searcher explained that one assistant would process payments for the buyers in the 
computer room while another assistant would process payments for the sellers in the 
adjacent room. Given our assurances to subjects that their individual payoffs would be 
kept private, this was a reasonable procedure to undertake. 


6 The underlying rationale is that the small fixed commission offsets a subject’s subjective transaction
cost in such a way that an economic point prediction at the competitive margin is still possible.

Thus, the five buyers were led back into the computer room where they received
their payments for the first ten periods. (This was done discreetly and separately for 
each individual.) The same was done for the sellers in the adjoining room. In addition, 
sellers were informed that starting with period 11, the sellers’ tax would change from a 
50 percent tax on profits to a 20 percent tax on revenues. As explained in the preceding 
section, this tax change has the effect of dividing the marginal cost schedule by 0.8, thus 
shifting the supply curve upward. This is also shown in figure 1, which indicates the in- 
crease in the competitive price prediction from $2.50 to $2.90. 
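
The arithmetic behind this upward shift in supply can be sketched as follows. This is an illustration only: the function names and the $2.00 unit cost are hypothetical and are not taken from the experiment's cost schedules. Under a tax on profits, a seller's after-tax profit is zero exactly when price equals cost, so the break-even price is the pretax cost. Under a 20 percent tax on revenues, the seller keeps 80 percent of the price, so the break-even price is the cost divided by 0.8.

```python
def after_tax_profit_income_tax(price, cost, tax_rate=0.50):
    """Per-unit seller profit when a fraction of pretax profit is taxed away.

    Assumes price >= cost (the text does not say how losses are taxed).
    """
    return (1.0 - tax_rate) * (price - cost)


def after_tax_profit_sales_tax(price, cost, tax_rate=0.20):
    """Per-unit seller profit when a fraction of the sales price is taxed away."""
    return (1.0 - tax_rate) * price - cost


def break_even_price_income_tax(cost, tax_rate=0.50):
    """Lowest price that avoids a loss; a profits tax leaves it equal to cost."""
    return cost


def break_even_price_sales_tax(cost, tax_rate=0.20):
    """Lowest price that avoids a loss: price * (1 - tax_rate) >= cost."""
    return cost / (1.0 - tax_rate)


if __name__ == "__main__":
    hypothetical_cost = 2.00  # illustrative value, not from the experiment
    print(break_even_price_income_tax(hypothetical_cost))
    print(break_even_price_sales_tax(hypothetical_cost))
```

Applying `break_even_price_sales_tax` to every unit on a marginal cost schedule reproduces the "divide by 0.8" shift described above.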

Sellers were then led back into the computer room. At this point (with all subjects 
reconvened), the information treatment was manipulated. In the three marginal cost- 
disclosure markets, buyers and sellers read a script that included the following: 


Starting with the next period, sellers will no longer pay a tax equal to 50% of their pre-tax
profit. Rather, sellers will pay a tax equal to 20% of the sales price of each trade. Notice
that the new tax is a direct reduction of the sellers’ gross revenue, as opposed to a reduction
of net profits. The effect of this change is to increase the price that a seller must
receive to cover his or her costs.


The purpose of this script was to emphasize that the new tax regime had the effect of in- 
creasing the sellers’ break-even price (after-tax marginal cost) for any pretax cost. An 
example was worked (with a nonsensical cost to avoid anchoring) to illustrate this 
effect. 

In the three profit-disclosure markets, our objective was to communicate the effect
of the tax change on sellers’ profits. To accomplish this, profit-sharing graphs were
distributed at the beginning of each period. The graphs for the income tax regime (periods
1-10) and the sales tax regime (periods 11-20) are reproduced together in figure 2.⁷
These graphs show the percentage of total available after-tax profits received by the
sellers at any market price.⁸ Implicit in these percentages were the buyers’ relative
profits, which subjects could easily compute as 100 percent minus the sellers’
percentages.

Figure 2 indicates that sellers extract 50 percent of the total after-tax surplus at the
competitive equilibrium price of $2.50 under the income tax regime. This share
increases to 85 percent of the total after-tax surplus at the competitive equilibrium price
of $2.90 under the sales tax regime. Thus, at competitive equilibrium, sellers capture a
much larger share of the total available surplus with a sales tax. Further, this same
conclusion holds for any of the prices observed in either the income tax or sales tax
regimes.⁹ This point was emphasized in the following script that was read to all subjects
after the break in the profit-disclosure markets:


7 The different treatments were not technically equivalent during the first ten (income tax) periods because
graphs were used in the profit-disclosure condition. However, we do not believe that this confounded
the assumption of an equal baseline between treatments prior to the tax regime change because the competitive
price ($2.50) under the income tax regime also resulted in an equal profit division between buyers and sellers.
Data from the income tax periods support this conclusion; there were no discernible differences in market
prices immediately prior to the break between periods 10 and 11.

8 In constructing the profit-sharing graphs, we assumed that traders would not willingly trade at a loss.
Thus, for off-equilibrium prices, we calculated profits only for the most profitable units at the largest volume
that would allow only profitable (or zero profit) trades to take place.

9 The sales tax results in higher sellers’ profit percentages for any price in excess of $1.00. A $1.00 price is
well below the lowest prices actually observed in any of the market periods.

Figure 2
Profit Information Disclosure
Income Tax and Sales Tax Regimes

Note: In this figure, the income tax profit-sharing graph and sales tax profit-sharing graph are presented
together for expository purposes only. Subjects received separate income tax and sales tax graphs that
were not overlaid. In addition, the income tax equilibrium (Eme) and sales tax equilibrium (Ess) were
not displayed in the subjects’ materials.


Starting with the next period, sellers will no longer pay a tax equal to 50% of their pre-tax
profit. Rather, sellers will pay a tax equal to 20% of the sales price of each trade. The
change in the seller tax changes how the total profits available to sellers and buyers each
period are shared for any given market price. What has essentially happened is that the
new tax is less burdensome than the old tax, and thus the sellers’ share of the profits is
greater at any given price.


To ensure that subjects attended to the profit-sharing information, they were asked
to mark their expected trading price and the related seller profit percentage immediately
before each period. At the end of each period, subjects were also asked to mark the
actual price and sellers’ profit percentage for their last trade. This information was
collected each period.

In the three markets with no buyer disclosures, post-break periods began immedi- 
ately after sellers were reconvened in the computer room. Buyers were not apprised of 
any information about the new tax. 

In all treatment conditions, ten additional periods under the new tax regime were 
conducted after the break and any information disclosure. At no time were subjects in- 
formed of the number of remaining periods. After the twentieth period, the market was 
terminated, a post-experimental questionnaire was administered (discussed later), sub- 
jects were paid for their profits during the final ten periods, and they were dismissed 
after a request that they not discuss any aspect of the experiment. 


III. Results
Price Effects—Primary Findings 


Figure 3 is a plot of the average prices across markets by treatment condition for 
each of the 20 market periods. This same information is tabulated by individual market 
in table 1 for period 10 (the last of the baseline periods under the income tax regime) 
and periods 11-20, which varied among the three information structures. 

These descriptive data reveal two important price effects, both of which are sup- 
ported by the various statistical tests reported below. First, prices do not appreciably 
differ immediately preceding the introduction of the new tax regime in period 11. The 
test of information effects associated with the new tax is predicated on the assumption 
of an equal baseline immediately prior to the new tax, so this apparent pre-treatment 
equivalence between markets in period 10 is encouraging. Second, in the periods fol- 
lowing the tax change, prices tend to spread in the predicted directions between the 
three information structures. Specifically, prices are the highest in the markets with 
marginal cost disclosure (fully converging to the new competitive price). The lowest 
prices are observed in the markets with profit disclosure. Control markets with no dis- 
closures fall between these treatments. 

In table 2, the price differences noted above are tested for statistical significance by
market period. Because these period-by-period mean difference tests violate the independence
assumption, we later conduct a series of alternative statistical tests to corroborate
the primary results. Further, the period-by-period tests aggregate prices across
the three markets in each treatment condition, thus obscuring possible differences between
market replications. As a result, we report the dispersion of individual transaction
prices by period and by individual market in figures 4-6. In figure 4, the interquartile
price range is plotted by period for each of the three marginal cost-disclosure
markets. The same data are reported for each profit-disclosure market in figure 5 and for
each no-disclosure market in figure 6.

For hypothesis H1, table 2 shows that prices under the sales tax regime increased
more rapidly in the marginal cost-disclosure condition than in the profit-disclosure
condition, as predicted. The difference is significant at the α = 0.05 level or better in all
treatment periods except periods 12 and 13, and the difference widens in the later periods.
We therefore conclude that the data support hypothesis H1. A closer look at the
individual market data in table 1 indicates, however, that this support is driven by two
of the three profit-disclosure markets (PR2 and PR3). Conversely, profit-disclosure
market PR1 shows a price pattern similar to the marginal cost markets, in that it converges
to the competitive prediction.

Table 2
Average Differences in Market Prices by Period*

                                              Comparisons
                              Marginal Cost    Marginal Cost       Profit
                              - Profit         - No Disclosure     - No Disclosure

Period 10 (last period        $-0.003           0.010               0.013
under income tax regime)      (-0.22)           (0.68)              (1.30)
Period 11 (first period        0.106            0.054              -0.052
under sales tax regime)       (4.25)            (2.41)             (-3.22)°
Period 12                      0.022            0.044               0.022
                              (0.95)            (1.78)              (0.97)
Period 13                      0.029            0.042               0.013
                              (1.06)            (1.56)              (0.48)
Period 14                      0.052            0.044              -0.008
                              (1.98)?           (1.69)             (-0.23)
Period 15                      0.056            0.029              -0.027
                              D.h               (1.29)             (-0.84)
Period 16                      0.093            0.036              -0.057
                              (3.67)"           (2.29)             (-2.01)°
Period 17                      0.102            0.027              -0.075
                              (4.47)*           (1.89)             (-2.95)*
Period 18                      0.102            0.034              -0.088
                              (5. Gi            (3.22)°            (-3.42)
Period 19                      0.117            0.020              -0.097
                              (3.66)*           (2.727             (-2.96)*
Period 20                      0.096            0.013              -0.083
                              (3.36)*           (1.42)             (-2.77)*

* t-statistics are in parentheses.
*“p<0.01 (two-tailed test).
t p<0.05 (two-tailed test).

Results for hypotheses H2 and H3 are less pronounced, but are still significant in
later trading periods. Thus, there is some evidence in support of both hypothesis H2
(marginal cost-disclosure prices above the no-disclosure control) and hypothesis H3
(profit-disclosure prices below the no-disclosure control). With the exception of periods
12 and 13, the average price in the three no-disclosure control markets fell between the
two information treatment structures after the switch to a sales tax.


Corroborating Tests 


Figure 4
Mean Prices and Interquartile Ranges for the Three Marginal Cost-Disclosure Markets

In this subsection we assess the robustness of the primary conclusions. For the sake
of brevity, we concentrate on hypothesis H1. A conservative but statistically valid approach
is to average prices by market, then use these averages as independent statistical
observations. For our design, this test would suffer from very low statistical power
(four degrees of freedom in the t-test comparison between two groups of three markets
each). Yet, the difference is marginally significant (t(4) = 1.47; p = 0.11) in a one-tailed
test averaging prices by market across the last five periods (16-20). The difference is
significant when the one anomalous profit-disclosure market (PR1) is excluded
(t(3) = 19.06; p<0.01).
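
The degrees-of-freedom arithmetic in this market-level test can be sketched as below. This is a minimal illustration assuming an equal-variance (pooled) two-sample t-test; the function name and the market averages in the demonstration are hypothetical and are not the study's data. With three markets per group the test has 3 + 3 - 2 = 4 degrees of freedom, and dropping one market leaves 3 + 2 - 2 = 3.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for an equal-variance two-sample t-test.

    Each element of a group is one market-level average price, so the
    number of observations equals the number of markets, not trades.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance weights each group's variance by its own df.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


if __name__ == "__main__":
    # Hypothetical market averages for periods 16-20 (illustration only).
    marginal_cost_markets = [2.90, 2.89, 2.88]
    profit_markets = [2.88, 2.78, 2.76]
    print(pooled_t_statistic(marginal_cost_markets, profit_markets))
    # Excluding one profit-disclosure market reduces df from 4 to 3.
    print(pooled_t_statistic(marginal_cost_markets, profit_markets[1:]))
```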

Another statistical alternative is to assume that the introduction of a new tax
regime after period 10 had the effect of perturbing the time series of prices for all 20
periods. Did the markets with different information structures differentially respond to
this time-series perturbation? To answer this question, we fit a time-series model to the
laboratory market data. Since we do not have enough temporal observations for a
precise model estimation, we assume that our experimental data fit a linear integrated
moving average process, or ARIMA (0,1,1). This approach allows both linear trend and
moving average serial correlation incremental to trend.¹⁰ Glass et al. (1975) indicate that
an ARIMA (0,1,1) model is common for experimental data with a time-series perturbation.
The ARIMA (0,1,1) model for a differential reaction to the tax regime change is as
follows:


\[
PRICEDIF_t = \alpha + \beta\,TAX_t + (1-\Theta)\sum_{j=1}^{t-1} a_j + a_t
\]

where:

PRICEDIF_t  is the difference between the average prices in the marginal cost-disclosure
            markets and profit-disclosure markets for period t, t=1,...,20
            (one observation for each period),
TAX_t       is set equal to 0 for the income tax periods (t=1,...,10), and 1 for the
            sales tax periods (t=11,...,20),
α           is the intercept term,
β           is the coefficient on TAX_t, measuring the differential impact of the tax
            change between the two information structures,
a_t         is the residual term, assumed independent and identically normally
            distributed, and
Θ           is the ARIMA (0,1,1) parameter, capturing both linear trend and
            first-order moving average serial correlation.


An iterative maximum likelihood regression approach is used to estimate α and β,
where the data matrix is first transformed to remove trend and serial correlation and
the parameter Θ is then chosen to minimize the sum of squared residual error across
the admissible range of Θ from -1 to +1.¹¹ The objective of this model is to estimate
any shift in the level of price reactions to the new tax due to the different information
structures, after controlling for differences in trend and serial correlation. The estimation
shows that the best fit occurs at Θ = 0.5, where the TAX_t coefficient β is estimated
at 0.09 (t(18) = 3.38; p<0.01). Thus, the time-series analysis supports the same conclusion
as that of the period-by-period mean-difference tests: marginal cost disclosures
resulted in significantly higher prices than profit disclosures.¹²

10 Practical alternatives to an ARIMA (0,1,1) model include a simple first-order autoregressive, or ARIMA
(1,0,0), model and a quadratic integrated moving average, or ARIMA (0,2,1), model. We also fitted the data to
these alternative models, attaining results similar to those reported.

11 When Θ = 0, the model is a simple first-differencing process for linear trend. When Θ = 1, the model reduces
to ordinary least squares regression. Other values for Θ introduce both trend and moving average serial
correlation. For details on the estimation process, see Glass et al. (1975, 134-40).
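
The transform-then-search logic described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the estimation code used in the study: it assumes one common conditional least-squares formulation in which the series and each regressor are passed through the same recursive transform for a trial value of Θ, the intercept and TAX coefficient are then fitted by least squares, and the Θ with the smallest sum of squared residuals is kept. All function names, the grid spacing, and the demonstration series are hypothetical. Consistent with the special cases noted for Θ, the transform returns the raw series when theta = 1 (ordinary least squares) and first differences when theta = 0.

```python
def transform(series, theta):
    """Recursive transform: y[0] = z[0]; y[t] = (z[t] - z[t-1]) + theta * y[t-1]."""
    out = [float(series[0])]
    for t in range(1, len(series)):
        out.append(series[t] - series[t - 1] + theta * out[-1])
    return out


def least_squares_two_columns(y, x1, x2):
    """Solve y = b1*x1 + b2*x2 via the 2x2 normal equations; return (b1, b2, sse)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    r1 = sum(a * v for a, v in zip(x1, y))
    r2 = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (r1 * s22 - r2 * s12) / det
    b2 = (r2 * s11 - r1 * s12) / det
    sse = sum((v - b1 * a - b2 * b) ** 2 for v, a, b in zip(y, x1, x2))
    return b1, b2, sse


def fit_level_shift(price_dif, tax, thetas):
    """Grid-search theta; return (theta, alpha, beta, sse) with the smallest sse."""
    ones = [1.0] * len(price_dif)
    best = None
    for theta in thetas:
        y = transform(price_dif, theta)
        x_const = transform(ones, theta)  # intercept column after the transform
        x_tax = transform(tax, theta)     # TAX dummy after the transform
        alpha, beta, sse = least_squares_two_columns(y, x_const, x_tax)
        if best is None or sse < best[3]:
            best = (theta, alpha, beta, sse)
    return best


if __name__ == "__main__":
    # Hypothetical price-difference series (illustration only, not study data).
    tax = [0] * 10 + [1] * 10
    price_dif = [0.01, 0.00, 0.02, 0.01, -0.01, 0.00, 0.01, 0.02, 0.00, 0.00,
                 0.10, 0.03, 0.03, 0.05, 0.06, 0.09, 0.10, 0.10, 0.12, 0.10]
    grid = [i / 20 for i in range(-20, 21)]  # theta from -1 to +1
    print(fit_level_shift(price_dif, tax, grid))
```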


Volume and Efficiency 


Table 3 provides data on volume and efficiency in each of the nine markets to
address hypothesis H4. Efficiency is defined as the proportion of total feasible market
surplus (buyers’ profit plus sellers’ profit before taxes) actually achieved by the market
during each period.¹³ As indicated in table 3, after the shift to a sales tax, both volume
and efficiency were generally lower in the profit-disclosure markets than in either the
marginal cost or the no-disclosure markets.

For volume, the average quantity traded each period for the last five periods was
12.9 in the marginal cost markets, 11.8 in the profit-disclosure markets, and 12.1 in the
no-disclosure control markets. The equilibrium quantity prediction was 13. These data
suggest that buyers in the profit-disclosure markets were successful in withholding
about one unit each period from aggregate demand. This behavior had the effect of
reducing average efficiency for the last five periods from 99.6 percent in the marginal
cost markets to 95.6 percent in the profit-disclosure markets. Although this difference
does not appear large in an absolute sense, it should be noted that 13 of the 15
efficiencies computed for the last five periods of the marginal cost-disclosure markets were
at 100 percent of the total attainable surplus. This was true for only two of the 15
observations in the profit-disclosure markets. Thus, on balance, there is support for both the
volume and the efficiency predictions of hypothesis H4.
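
The efficiency measure defined above can be illustrated with a short sketch. The function names and the redemption values and costs in the demonstration are hypothetical, chosen only to show the calculation. Because the price paid by the buyer is the price received by the seller, the pretax surplus on a traded unit is its redemption value less its cost, whatever the price. The maximum feasible surplus pairs the highest redemption values with the lowest costs for as long as a trade adds surplus.

```python
def max_feasible_surplus(redemption_values, unit_costs):
    """Largest total pretax surplus the market could generate in one period."""
    values = sorted(redemption_values, reverse=True)  # most valuable units first
    costs = sorted(unit_costs)                        # cheapest units first
    total = 0.0
    for value, cost in zip(values, costs):
        if value < cost:
            break  # any further trade would destroy surplus
        total += value - cost
    return total


def efficiency(traded_units, redemption_values, unit_costs):
    """Realized pretax surplus as a fraction of the maximum feasible surplus.

    traded_units is a list of (redemption_value, cost) pairs, one per trade.
    """
    realized = sum(value - cost for value, cost in traded_units)
    return realized / max_feasible_surplus(redemption_values, unit_costs)


if __name__ == "__main__":
    # Hypothetical schedules (illustration only).
    values = [10, 8, 6, 4]
    costs = [1, 3, 5, 7]
    # Buyers withhold the marginal unit, so only two of three efficient trades occur.
    print(efficiency([(10, 1), (8, 3)], values, costs))
```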


Construct Validation 


We interpret the results reported above as providing support for the principle of
dual entitlements. The validity of this interpretation hinges on whether buyers in the
different information structures differentially assess “fair” pricing in the context of
their particular market. We administered a post-experimental questionnaire after
period 20 that asked traders in each market to indicate their perceptions of the fair (not
necessarily the actual) price in periods 1-10 (income tax) and periods 11-20 (sales tax).¹⁴
Averages for the buyers’ responses are reported in table 4.

Statistical tests on the data summarized in table 4 indicate that subjects did not perceive
differences in fair prices prior to the change in tax regime. However, under the
new tax, buyers’ assessments of fair prices for periods 11-20 diverged among the
three information structures. That is, perceived fair prices in the profit-disclosure markets
were systematically lower than those in either the marginal cost-disclosure markets
(t(28) = -4.18; p<0.01) or in the no-disclosure control markets (t(28) = -2.98;


12 In a simpler trend regression with raw prices as the dependent variable and an indicator variable for the
information structure (but with no attempt to control for serial correlation), the information variable remained
statistically significant in the predicted direction.

13 Pretax efficiency was computed because taxes do not represent a welfare loss to the overall economy.

14 Subjects were first asked, “During periods 1-10 of the market you just participated in (i.e., before the
break), what was the most fair trading price?” This question was followed by an identically worded question for
periods 11-20 (after the break). The reason for not eliciting fair price responses for periods 1-10 until after the
entire experiment was over was that we did not want the questionnaire itself to potentially contaminate actual
market trading.

Table 4
Buyers’ Post-Experimental Self-Reports of Fair Prices by Tax Regime and Market

                              Fair Price in    Fair Price in    Implied Fair
                              Periods 1-10     Periods 11-20    Price Increase

Marginal Cost Disclosure
  Market MC1                  $2.56*           2.80              0.24
  Market MC2                   2.49            2.88              0.39
  Market MC3                   2.38            2.85              0.47
  Average                      2.48            2.84              0.36

Profit Disclosure
  Market PR1                   2.61            2.65              0.14
  Market PR2                   2.54            2.53             -0.01
  Market PR3                   2.48            2.59              0.13
  Average                      2.50            2.59              0.08

No Disclosure
  Market ND1                   2.63            2.86              0.23
  Market ND2                   2.54            2.68              0.14
  Market ND3                   2.55            2.83              0.28
  Average                      2.57            2.79              0.22

* Each market observation represents an average for five buyers.


p<0.01). There was weak evidence that fair price perceptions in the marginal cost markets
exceeded those in the no-disclosure markets (t(28) = 1.71; p=0.10). This post-experimental
elicitation of buyers’ views of the fairness of prices supports the construct
validity of the primary findings.¹⁵


IV. Conclusions 


The principle of dual entitlements goes beyond the conventional structure of micro- 
economic theory to predict that an economically predicted price increase might be in- 
fluenced by the perceived equity of that increase. In turn, the theory implies that infor- 
mation related to fair price perceptions can influence actual prices. We find general 
support for this notion in markets that varied only in the information provided to 
traders about a change from an income tax on sellers’ profits to a sales tax on sellers’ 
revenues. 

However, closer scrutiny at the individual market level indicates that the strength
of this conclusion may be sensitive to particular markets. For example, profit-disclosure
market PR1 more closely resembled the marginal cost-disclosure markets
than the other profit-disclosure markets. If the price effects of the principle of dual
entitlements can be likened to the emergence of buyer cartels, as we argued earlier, then
the cartel collapsed in market PR1. This suggests that the research conclusion should
be cast more as an “existence” statement than as a systematic tendency. That is, we
conclude that it is possible that information influencing perceptions of fairness can
result in actual market price differences, but we cannot make the stronger statement
that this phenomenon will always result. Moreover, the theory of dual entitlements
might predict best when the expected buyer response (a resistance to trade) is also
collectively profitable to the buyers, as in this study. In a different economic environment
where disciplining the sellers is individually and collectively costly, the
theory might predict less well.

15 As further support for this conclusion, only in the profit-disclosure markets were self-reported prices
less than actual negotiated prices. This indicates that subjects’ perceptions of fairness were influenced by the
profit-sharing graphs, not just actual historical prices.

Nevertheless, this study’s findings suggest a broader user focus for the effects of ac- 
counting information; not only might investors be influenced, but agents in a firm’s 
primary markets for goods and services might also be affected. In general, this type of 
phenomenon may help to explain why firms might not always elect the most favorable 
disclosures from an investor’s perspective. 

Although the results suggest implications for strategic disclosure decisions on the
part of both firms and regulators, it should be stressed that such disclosures were
exogenous in this study for reasons of experimental control. An informative extension more
directly focused on strategic disclosure would allow traders to choose their disclosures.
In this manner, both the reactions to disclosures and the incentives to disclose could be
assessed. We leave the study of this richer environment to future research.


Appendix 
Market Instructions 





This research is a market experiment in which you will make either buying or selling decisions for actual 
monetary receipts. The cash you accumulate will be paid to you at the end of the market. As we read the in- 
structions, please feel free to interrupt if you have any questions. 

Shortly, we will randomly designate five of you as buyers and five of you as sellers. However, you will
make your decisions as individuals. During the market, the buyers will purchase imaginary “units” from the
sellers. You can think of these units as any commodity.

You will decide the prices at which units are bought and sold. We will discuss how to do this later, but 
first we need to explain how you calculate your profit in this market. 


Buyer’s Profit 


A buyer will receive a profit on each unit purchased equal to that unit’s redemption value (the monetary
value of that unit to that buyer) less the price paid. Each buyer will receive a sheet which indicates the
redemption value for each unit he or she purchases.


Seller’s Profit 


A seller receives a profit on each unit sold equal to the sales price less that unit’s cost and less a tax that 
will be explained later. Each seller will receive a sheet which indicates the cost for each unit he or she sells. 


How the Market Works 


The market will consist of many distinct periods. During each period, each seller can sell a maximum of 5
units, while each buyer can buy a maximum of 5 units. The actual number of units you sell or buy is up to you.


Example 1: In period 1, a seller decides to sell 3 units. The fourth and fifth (unsold) units do not carry
over to period 2. In period 2, this seller can again sell either 0, 1, 2, 3, 4, or all 5 of his or her units.

Example 2: In period 1, a buyer decides to buy only 2 units, despite the fact that he or she could have
bought as many as 5 units. In period 2, this buyer again decides whether to buy 0, 1, 2, 3, 4, or the
maximum of 5 units.


How A Buyer Calculates Profits 


The buyer’s profit from a purchase equals the redemption value of the unit purchased (the monetary value
of the unit to the buyer) less the market price for the period. Buyers will keep track of their profits on
record-keeping forms we will provide.

The values used in the following example are for illustration only, and have nothing to do with the actual
values used in the experiment.

Example: Suppose in period 2, a buyer purchases 3 units as follows:

                       1st Unit    2nd Unit    3rd Unit
                       Bought      Bought      Bought
Redemption Value:      $85.00      $80.00      $70.00
Market Price:          $50.00      $50.00      $55.00

This buyer would calculate the profit from these purchases as follows:

                       1st Unit    2nd Unit    3rd Unit    4th Unit    5th Unit
                       Bought      Bought      Bought      Bought      Bought
Redemption Value       $85.00      $80.00      $70.00      $60.00      $55.00
<Market Price>        <$50.00>    <$50.00>    <$55.00>    <  NA  >    <  NA  >
= Profit              =$35.00     =$30.00     =$15.00     =  NA       =  NA

Thus, the buyer’s Total Profit = $80.00.

($80.00 = $35.00 + $30.00 + $15.00)

There was no profit on the fourth and fifth units since only 3 were purchased. We would pay this buyer
$80.00 in real cash for this period.


How A Seller Calculates Profits 


Each unit sold generates a pre-tax profit for the seller equal to the difference between the market price for
the period and the cost of the unit sold. The seller then subtracts a tax equal to 50 percent of the pretax profit
to determine the net profit for that period. Sellers keep track of their profits on record-keeping forms we will
provide.

The amounts used in the following example are for illustration only, and have nothing to do with the
actual amounts used in the experiment.

Example: Suppose in period 5, a seller sells 3 units as follows:

                       1st Unit    2nd Unit    3rd Unit
                       Sold        Sold        Sold
Market Price:          $50.00      $50.00      $55.00
Cost of Unit:          $15.00      $20.00      $30.00

This seller would calculate the profit from these sales as follows:

                               1st Unit    2nd Unit    3rd Unit    4th Unit    5th Unit
                               Sold        Sold        Sold        Sold        Sold
Market Price                   $50.00      $50.00      $55.00        NA          NA
<Cost of Unit>                <$15.00>    <$20.00>    <$30.00>    <$35.00>    <$45.00>
= Pre-tax Profit              =$35.00     =$30.00     =$25.00        NA          NA
<50% Tax on Pre-tax Profit>   <$17.50>    <$15.00>    <$12.50>    <  NA  >    <  NA  >
= Net Profit                   $17.50      $15.00      $12.50        NA          NA

Thus, the seller’s Total Profit = $45.00.

($45.00 = $17.50 + $15.00 + $12.50)


In the event the Net Profit on a unit results in fractional cents, just truncate to the nearest penny. For 
example, if the market price of the third unit above had been $55.25, then the Net Profit from this unit would 
have been $12.625. Truncated to the nearest penny this would equal $12.62. 
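
The profit calculations in these instructions can be checked with a short sketch. It is an illustration only: the function names are hypothetical, and working in whole cents is an assumption made here so that truncating fractional cents is exact. The figures in the demonstration are the illustrative ones from the examples above, not values used in the experiment.

```python
def buyer_total_profit_cents(redemption_values, prices):
    """Sum of (redemption value - price paid) over the units actually bought."""
    return sum(value - price for value, price in zip(redemption_values, prices))


def seller_net_profit_cents(price, cost, tax_percent=50):
    """Net profit on one unit after a tax on pretax profit, in whole cents.

    Assumes pretax profit is non-negative, so floor division drops any
    fractional cent (truncation), as the instructions require.
    """
    pretax = price - cost
    return pretax * (100 - tax_percent) // 100


if __name__ == "__main__":
    # Buyer example: three units bought (amounts in cents).
    print(buyer_total_profit_cents([8500, 8000, 7000], [5000, 5000, 5500]))
    # Seller example: three units sold (amounts in cents).
    sales = [(5000, 1500), (5000, 2000), (5500, 3000)]
    print(sum(seller_net_profit_cents(price, cost) for price, cost in sales))
    # Truncation example: third unit sold at $55.25 instead of $55.00.
    print(seller_net_profit_cents(5525, 3000))
```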
Security Returns Around Earnings
Announcements


Ray Ball
S. P. Kothari
University of Rochester

SYNOPSIS AND INTRODUCTION: We investigate risk, return, and abnormal
return behavior in the days around quarterly earnings announcements,
using a research design that allows risk to vary daily in event time. We test
several hypotheses concerning the effect on security prices of earnings
announcements per se (i.e., ignoring both the sign and the magnitude of
earnings). The first hypothesis concerns the resolution of uncertainty over
time. By conveying information about firms’ activities, earnings announcements
resolve some uncertainty about future cash flows, but the
concurrent price reactions increase the variability and covariability of
securities’ returns during the announcements. Thus, it is hypothesized that
return variances and betas, and therefore expected returns, increase during
earnings announcement periods (Stapleton and Subrahmanyam 1979;
Epstein and Turnbull 1980; Choi and Salamon 1989). Previous research has
demonstrated anomalous positive abnormal returns during earnings
announcements (Chambers and Penman 1984; Penman 1984, 1987; Chari
et al. 1988). Because risk was not allowed to vary in event time in this
research, it does not adequately distinguish between increased expected
returns and true abnormal returns. We report that abnormal returns remain
after controlling for risk increases at earnings announcements. The abnormal
returns are not related to any over- or under-reaction by the market to
earnings news (see, e.g., DeBondt and Thaler 1985, 1987; Bernard and
Thomas 1989) because we do not condition on the earnings realization.

The second hypothesis (the information hypothesis) is that the timing
of an earnings announcement is informative because managers systematically
announce good news early and bad news late (Givoly and
Palmon 1982; Chambers and Penman 1984; Kross and Schroeder 1984).
The hypothesis predicts that average abnormal returns: (1) are positive at
the earnings announcement, (2) are negative prior to the announcement,
and (3) cumulate to zero by the end of the announcement period. Our tests
extend those of Chari et al. (1988), Kross and Schroeder (1984), and
Chambers and Penman (1984) by examining the pattern of returns around
earnings announcements for the population of stocks. The pattern we
observe is not as predicted by the information hypothesis.

Finally, we investigate whether cross-sectional variation in announcement-period
risks and returns is a function of firm size, which is a
proxy for the increase in information arrival during earnings announcement
periods. The evidence reveals that, after controlling for risk increases,
abnormal returns generally are positive and decreasing in firm size. For the
smallest size decile, abnormal returns in the ten days up to and including
the earnings announcement are approximately 1.75 percent in the average
quarter, or approximately 7 percent over only 40 trading days per year. This
adds to an impressive body of size-related anomalies.

We use these results to reexamine Hand’s (1990) reinterpretation of
the functional fixation hypothesis. Hand investigated quarterly earnings
that included previously announced book gains from debt-equity swaps. He
distinguished between “sophisticated” and “unsophisticated” investors,
hypothesizing that only the former correctly comprehend the different
implications of swap gains and other components of earnings. He found
that abnormal returns increase in a variable representing the interaction
between the swap gain and a proxy for the probability that the marginal
investor is unsophisticated. We are skeptical about both the hypothesis and
whether it predicts the observed result. We interpret Hand’s result as
similar to the puzzling but typical size effect around earnings announcements.
It seems unlikely to be due to swap gains, to the sign or magnitude
of earnings information released at the time, to errors in measuring the
earnings information released, or to functional fixation.


Key Words: Earnings announcements, Efficient markets, Functional
fixation, Risk changes.


Data Availability: Available on request from authors.


THE remainder of this paper is organized as follows. Section I reviews
the literature on security risks, returns, and abnormal returns at information
announcements, including the effect of firm size. Section II describes the
data, research design, and results of testing various hypotheses. Section III reexamines
Hand’s tests of the functional fixation hypothesis, using both his swap data and data for
almost all the New York and American Stock ee iia = aaa 
Section IV provides our. conclusions. 8 
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L Hypotheses e on Information Arrival and the Evolution of Risk Over Time 


Uncertainty Resolution Hypothesis 


Assume that all earnings announcements are “routine,” which we define as a random
drawing from a known earnings distribution, at a known date. This is a reasonable
assumption for most firms, though not necessarily for extreme earnings realizations
(Kross and Schroeder 1984; Chambers and Penman 1984). The effect of routine
information on the evolution of security risk over time is addressed by Robichek and Myers
(1966), Ball and Brown (1969, 315-16), Stapleton and Subrahmanyam (1979), Epstein
and Turnbull (1980), Choi and Salamon (1989), and Holthausen and Verrecchia (1988).
By conveying information to investors concerning firms’ activities, routine earnings
announcements resolve some uncertainty about future cash flows. However, the
increased flow of information increases the variability of returns during earnings
announcement periods. Assuming earnings information is cross-correlated, covariances
among returns of securities announcing together, and thus their covariances with the
market portfolio, are also predicted to increase during earnings announcement
periods.¹ The market portfolio’s variance is affected only trivially because it is
dominated by covariances among the returns of non-announcing securities, which on any
day are a clear numerical majority, so announcing securities’ relative covariances (i.e.,
betas) increase in event time. Thus, announcing firms’ return variances, betas, and
(therefore) expected returns are expected to increase during earnings announcement
periods. We call this the uncertainty resolution hypothesis. Unlike prior research
(Penman 1984, 1987; Chari et al. 1988), we distinguish between abnormal returns and the
effect of risk increases on total returns by allowing betas to change daily in event time.


Announcements Are Per Se Informative Hypothesis or Information Hypothesis 


If earnings announcements are not routine, then per se they can convey information.
Specifically, if managers announce good news earnings early and delay earnings
reports that contain bad news (Niederhoffer and Regan 1972; Givoly and Palmon 1982;
Chambers and Penman 1984; Kross and Schroeder 1984), then the testable implications
for security returns are:


1. Firms announcing late signal bad news and thus earn negative average abnormal
   returns around their expected announcement dates (which precede their
   actual dates). Further, firms that do not announce earnings early signal the
   absence of good news (see 2 below) and thus earn negative average abnormal
   returns even prior to their expected announcement dates, commencing on or
   after their earliest feasible earnings announcement dates.

2. Firms announcing early signal good news and thus earn positive average abnormal
   returns at the time of their announcements.

3. Since essentially all firms announce their earnings each quarter, the fact that an
   earnings announcement occurs at some time is not per se informative. Good and
   bad news thus combine to produce average abnormal returns that cumulate to
   zero at the end of the earnings announcement period.


1 It is not helpful to argue that information is firm-specific and thus influences only securities’ “residual”
risks. If this premise is integrated across all information sources, then the market portfolio has a vanishingly
small variance: i.e., essentially all risk is diversifiable. Earnings information thus is a priori unlikely to be
independent of covariance effects (Ball and Brown 1968, fn. 40).

The information hypothesis therefore predicts a v-shaped pattern of average abnormal
returns in event time. We test this prediction by examining the pattern of abnormal
returns for the population of firms, avoiding measurement error in classifying individual
announcements as early or late.


Cross-Sectional Variation in Uncertainty Resolution 


The final hypothesis addresses cross-sectional variation in uncertainty resolution,
with size as an observable proxy for the amount of information arriving concurrently
with earnings announcements.² Therefore, under the uncertainty resolution
hypothesis, announcement-period risks, risk changes, and total returns are expected to
decrease in firm size. Notwithstanding the extensive evidence of size-related anomalies
(Banz 1981; Reinganum 1981), abnormal returns are not expected to be a function of
size.

We investigate the effect of firm size for three subsidiary reasons. First, we report 
estimates of risk that are allowed to vary daily in event time, to assess whether the nega- 
tive relation between size and announcement-period abnormal returns documented by 
Chari et al. (1988) is due in part to risk misestimation. Second, we can assess the effect 
of increased turnover around earnings announcements on risk estimates (Scholes and 
Williams 1977). Third, understanding the relation between size and abnormal returns at 
earnings announcements helps in assessing whether Hand’s (1990) results are due to 
that relation or to functional fixation in the context of swap gains.


II. Empirical Analysis and Results

Data 

The sample is selected from all NYSE-AMEX firms on both the COMPUSTAT
Quarterly tape in any quarter q from the first quarter of 1980 to the first quarter of 1988
and the Center for Research in Security Prices (CRSP) daily returns tape. The 51,178
selected firm-quarters satisfy the following data requirements: earnings for quarters q
and q−4, \(E_q\) and \(E_{q-4}\); market value of equity at the beginning of quarter q, \(MV_q\);
earnings announcement date for quarter q; and daily returns for a 21-day window
centered on the earnings announcement date.³ Market capitalizations of the sample firms
range from approximately $1 million to $100 billion, with a median of $194 million and
a mean of $971 million.


Research Design


Let \(\tau\) denote event time, with the earnings announcement date denoted as
\(\tau = 0\). The Capital Asset Pricing Model (CAPM) is estimated in risk premium form,
separately for each of 21 event-time days \(\tau = -10\) to \(+10\):⁴

\[ R_{i\tau t} - R_{ft} = \alpha_\tau + \beta_\tau (R_{mt} - R_{ft}) + \varepsilon_{i\tau t} \qquad (1) \]

where:

\(R_{i\tau t}\) = daily return on security i for calendar day t and event-day \(\tau\),
\(R_{mt}\) = CRSP equal-weighted market return for calendar day t,
\(R_{ft}\) = risk-free rate of return on calendar day t based on the monthly T-bill
rate of return,
\(\alpha_\tau\) and \(\beta_\tau\) are constants representing Jensen’s (1968) alpha (abnormal return) and
the CAPM relative risk on event-day \(\tau\), and
\(\varepsilon_{i\tau t}\) = a normally distributed disturbance term.


2 While the phenomenon is not well understood, earnings realizations appear relatively more uncertain for
smaller firms (Bathke et al. 1989; Collins et al. 1987; Bamber 1986). Changes in trading volume and return
variance around earnings announcements also are size-dependent (Grant 1980; Atiase 1985; Lobo and Mahmoud
1989).

3 Our sample suffers from a survivorship bias because the COMPUSTAT Quarterly tape contains only the
surviving firms. We ignore the survivorship bias problem because Chari et al. (1988), who control for it, find
evidence similar to that reported here.


The regression slope estimates the pooled cross-sectional average relative risk \(\beta_\tau\) on
event-day \(\tau\). Although the sample firms do not have identical betas, the ordinary least
squares (OLS) estimate of \(\beta_\tau\) remains unbiased and consistent, at the cost of some
statistical precision. With a sample size of more than 50,000 observations, however,
statistical precision is not a major concern.
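
As a minimal sketch of how equation (1) could be estimated for a single event day, the Python function below runs a one-regressor OLS of excess returns on market excess returns and reports t-statistics for the nulls \(\alpha_\tau = 0\) and \(\beta_\tau = 1\). The function name, argument names, and the illustrative inputs are assumptions introduced here, not part of the original study, and the standard errors are the textbook homoskedastic OLS formulas.

```python
from math import sqrt


def event_day_regression(excess_ret, excess_mkt):
    """OLS of excess returns on market excess returns for one event day.

    excess_ret: R - Rf for each observation (hypothetical input name).
    excess_mkt: Rm - Rf for the matching calendar days.
    Returns alpha, beta, and t-statistics for alpha = 0 and beta = 1.
    """
    n = len(excess_ret)
    if n != len(excess_mkt) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(excess_mkt) / n
    mean_y = sum(excess_ret) / n
    sxx = sum((x - mean_x) ** 2 for x in excess_mkt)
    if sxx == 0:
        raise ValueError("market excess returns have no variation")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(excess_mkt, excess_ret))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # Residual variance with n - 2 degrees of freedom.
    resid = [y - alpha - beta * x for x, y in zip(excess_mkt, excess_ret)]
    s2 = sum(e * e for e in resid) / (n - 2)
    se_beta = sqrt(s2 / sxx)
    se_alpha = sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    nan = float("nan")
    return {
        "alpha": alpha,
        "beta": beta,
        "t_alpha": alpha / se_alpha if se_alpha > 0 else nan,
        "t_beta_vs_1": (beta - 1.0) / se_beta if se_beta > 0 else nan,
    }
```

Running the function once per event day, from \(\tau = -10\) to \(+10\), would give one alpha and one beta per day, which is the layout of the event-time statistics discussed below.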

Because many firms announce their earnings on the same calendar date, and be- 
cause of the possibility of industry grouping in announcement dates, the disturbances 
of equation (1) could, on average, be positively cross-correlated and the significance 
of the estimated parameters could be overstated (Bernard 1987). To reduce this 
problem, on each calendar date, we form an equal-weighted portfolio of the firms 
announcing on that date, similar to Chari et al. (1988). Equation (1) is estimated
using 2,203 portfolio observations, which is the number of trading days over the sample
period minus those days on which no firms announced earnings.
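
The portfolio step can be sketched as follows, under the assumption that each observation is a (calendar date, return) pair for a firm on a given event day. The function name and the sample dates are hypothetical; the only technique shown is equal weighting within a calendar date.

```python
from collections import defaultdict


def calendar_date_portfolios(observations):
    """Equal-weighted portfolio return per calendar date.

    observations: iterable of (calendar_date, firm_return) pairs, one per
    announcing firm. Dates with no announcing firm simply do not appear,
    which mirrors dropping days on which no firm announced earnings.
    """
    by_date = defaultdict(list)
    for date, ret in observations:
        by_date[date].append(ret)
    return {date: sum(rets) / len(rets)
            for date, rets in sorted(by_date.items())}


# Illustrative data only: two firms announce on the first date, one on the second.
sample = [("1985-01-15", 0.01), ("1985-01-15", 0.03), ("1985-01-16", -0.02)]
portfolios = calendar_date_portfolios(sample)
```

Collapsing firms to one observation per date is what reduces the more than 50,000 firm-quarters to a few thousand portfolio observations.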


Evidence: Returns, Abnormal Returns, and Risk Estimates 


Table 1 reports statistics from the 21 daily event-time regressions. We do not
perform significance tests of changes in event time because (1) given the large sample size,
small differences are likely to be significant at conventional levels, and (2) the
event-time observations are not strictly independent. The second column reports average
daily event-time returns. The average return on day 0 (0.084 percent) is larger than the
average return on any of the ten prior or five subsequent days. This is consistent with
the uncertainty resolution hypothesis, which predicts announcement-induced
increases in relative (beta) risk and thus in expected return. There is no clear evidence of
unusual average returns on day −1 or day +1.⁵

The third column reports the standard deviation of daily returns for each event-time
day. If the information in earnings reports is primarily firm-specific, then using
portfolio returns on each calendar date would understate the true increase in return
variance of individual securities around their earnings announcement dates. We therefore
randomly select one security on each calendar date instead of using portfolio
returns to calculate the standard deviation of returns in event time.⁶ The standard deviations
of returns increase during the earnings announcement period, as first observed by
Beaver (1968). The day 0 standard deviation is 30 percent greater than its average over
days −10 through −2 and +2 through +10.


4 This is an adaptation of Ibbotson’s (1975) technique. It is used by Brennan and Copeland (1988) and
Kalay and Loewenstein (1985) with daily returns and by Chan (1988) and Ball and Kothari (1989) with monthly
and annual returns.

5 The sequence of event days +6 through +10 exhibits five daily average returns, each of which is greater
than each of the prior 16 days’ returns. Since this pattern is revealed in neither the abnormal returns nor the
risks, it implies a market-index effect that we cannot explain in the context of the present research
design.


Table 1

Daily Average Total Returns, Standard Deviation of Returns, Abnormal Returns,
and Systematic Risk Estimates on 21 Days Centered Around
Firms’ Quarterly Earnings Announcements*

Day   R (%)    SD (%)   Alpha (%)   t(alpha=0)   Beta       t(beta=1)   Adj. R²   CAR (%)
−10    0.074    2.87     0.001       0.04         0.98      −0.86        0.445     0.001
−9     0.079    2.97     0.010       0.54         1.02       0.80        0.487     0.012
−8     0.081    3.04    −0.003      −0.14         1.03       1.35        0.494     0.009
−7     0.038    3.17    −0.027      −1.38         1.02       0.89        0.485    −0.018
−6     0.030    3.19    −0.002      −0.09         1.02       0.80        0.484    −0.020
−5     0.050    2.98     0.026       1.27         1.03       1.30        0.475     0.006
−4     0.041    2.84    −0.036      −1.90         0.98      −0.92        0.480    −0.030
−3     0.052    3.15     0.033       1.74         1.00       0.15        0.480     0.003
−2     0.017    3.16     0.004       0.17         1.01       0.41        0.441     0.007
−1     0.058    3.38     0.078       3.50         1.07       2.71        0.438     0.085
0      0.084    4.02     0.066       2.35         1.05       1.64        0.321     0.151
+1     0.058    3.98     0.015       0.56         1.11       3.64        0.3868    0.166
+2     0.048    3.47     0.002       0.08         1.08       2.98        0.423     0.168
+3     0.035    3.26    −0.011      −0.52         1.04       1.67        0.460     0.157
+4     0.048    3.17    −0.006      −0.28         1.09       3.77        0.486     0.152
+5     0.075    3.13    −0.007      −0.30         1.08       2.94        0.417     0.144
+6     0.113    3.20     0.025       1.19         0.97      −1.23        0.417     0.170
+7     0.127    2.99     0.027       1.30        [illeg.]    4.53        0.486     0.197
+8     0.085    3.01    −0.031      −1.53         0.99      −0.42        0.443     0.166
+9     0.100    3.03    −0.015      −0.74         1.03       1.28        0.457     0.150
+10    0.120    3.08     0.023       1.10         1.08       2.45        0.480     0.174


* The sample consists of 51,178 NYSE-AMEX firm-quarter observations from 1980-1988. Day is the trading
day \(\tau\) relative to the earnings announcement date; R is \(R_\tau\), the equal-weighted total return on event day \(\tau\);
SD is \(\sigma_{R\tau}\), the cross-sectional standard deviation of event-day \(\tau\) returns, using the return on one randomly
selected firm from each calendar date; Alpha and Beta are \(\hat\alpha_\tau\) and \(\hat\beta_\tau\), the abnormal return and systematic
risk estimates obtained from:

\[ R_{p\tau t} - R_{ft} = \alpha_\tau + \beta_\tau (R_{mt} - R_{ft}) + \varepsilon_{p\tau t} \]

where \(R_{p\tau t}\) is the return on an equal-weighted portfolio of all the firms reporting quarterly earnings on calendar
date t (2,203 distinct calendar dates). t-statistics are for the null hypotheses \(\alpha_\tau = 0\) and \(\beta_\tau = 1\), for each
event-day \(\tau\). CAR is \(CAR_\tau\), the cumulative abnormal return from event-day −10 to \(\tau\).
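
The 30 percent figure can be checked directly from the standard deviation column of table 1. The short Python sketch below does so; the dictionary holds the values as read from that column (in percent), and the function name is an assumption made for illustration.

```python
# Standard deviations of daily returns (percent) by event day, as read from table 1.
SIGMA_BY_EVENT_DAY = {
    -10: 2.87, -9: 2.97, -8: 3.04, -7: 3.17, -6: 3.19, -5: 2.98, -4: 2.84,
    -3: 3.15, -2: 3.16, -1: 3.38, 0: 4.02, 1: 3.98, 2: 3.47, 3: 3.26,
    4: 3.17, 5: 3.13, 6: 3.20, 7: 2.99, 8: 3.01, 9: 3.03, 10: 3.08,
}


def announcement_day_ratio(sigma_by_day, excluded=(-1, 0, 1)):
    """Day-0 standard deviation divided by the mean over non-excluded days."""
    baseline = [s for day, s in sigma_by_day.items() if day not in excluded]
    return sigma_by_day[0] / (sum(baseline) / len(baseline))


ratio = announcement_day_ratio(SIGMA_BY_EVENT_DAY)
print(f"day-0 / 18-day baseline = {ratio:.2f}")  # prints 1.30
```

The same function applied to the standard deviation columns for the size deciles gives the corresponding ratios quoted later for small and large firms.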

The fourth column reports event-time average abnormal returns, with t-statistics
reported in the fifth column. Even after controlling for risk shifts, firms earn reliably
positive abnormal returns on event day −1 (0.078 percent, t-statistic = 3.50) and day 0
(0.066 percent, t-statistic = 2.35). Although these abnormal returns are small in
magnitude, they are inconsistent with the uncertainty resolution hypothesis. This anomaly is
not explained by the day-of-the-week seasonal observed in stock returns (French 1980;
Gibbons and Hess 1981; Keim and Stambaugh 1984).⁷


6 Because the amount of error variance in daily returns due to nonsynchronous trading on the earnings
announcement days is likely to be smaller than on other days, this procedure induces a bias against finding the
hypothesized variance increase in returns around the earnings announcement dates. The Scholes and
Williams (1977) beta estimates reported later are consistent with nonsynchronous trading effects being smaller
around the earnings announcement period. A second reason for a downward bias is that announcement-period
returns are less positively skewed, which reduces the announcement-period return variance estimates
(McNichols 1988).

Figure 1

To test the information hypothesis, we focus on the cumulative abnormal returns
(CARs) reported in the last column of table 1 and presented in figure 1. They reveal a
step increase on days \(\tau = -1\) and 0. There is weak evidence of negative
preannouncement abnormal returns during event days −8 through −6, which is consistent with the
information hypothesis, but the estimates are statistically insignificant at the
conventional level. There is no evidence of the predicted v-shaped pattern of abnormal returns,
which are not predominantly negative over the preannouncement period and do not
cumulate to zero by the announcement date (they are large and positive). This result
was not altered by examining event days \(\tau = -30\) to −11, in case some firms delay bad
news by more than ten trading days (approximately two weeks). There was no evidence
of negative abnormal returns over the extended period: daily average abnormal returns
were essentially zero and cumulated to 0.01 percent by \(\tau = -11\).⁸


7 Day-of-the-week return seasonals could affect announcement-period returns if there were a daily
seasonal in earnings announcements. This is unlikely to explain the abnormal returns we observe because (1) the
CAPM in equation (1) controls for the market return, and (2) there is little day-of-the-week seasonal in
earnings announcements. For our sample, the relative frequencies of earnings announcements range from 16.6 to
22.4 percent over the five weekdays (see also Penman 1987).
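
The cumulation behind the CAR column can be illustrated with a short Python sketch. The daily abnormal returns are transcribed from table 1 at three-decimal precision, so the running sums can differ from the published CAR column by rounding in the last digit or two; the variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Daily abnormal returns (percent) for event days -10..+10, as read from table 1.
ALPHA_PCT = [0.001, 0.010, -0.003, -0.027, -0.002, 0.026, -0.036,
             0.033, 0.004, 0.078, 0.066, 0.015, 0.002, -0.011,
             -0.006, -0.007, 0.025, 0.027, -0.031, -0.015, 0.023]
EVENT_DAYS = list(range(-10, 11))

# CAR on day tau is the running sum of daily abnormal returns from day -10.
car = dict(zip(EVENT_DAYS, accumulate(ALPHA_PCT)))

# Size of the step over days -1 and 0, and the lowest point of the series.
step = car[0] - car[-2]
trough_day = min(car, key=car.get)

print(f"CAR(0) = {car[0]:.3f}, CAR(+10) = {car[10]:.3f}, step = {step:.3f}")
```

A v-shaped pattern would show a clearly negative running sum before the announcement that returns to zero by the end of the window; here the running sum stays close to zero until day −2 and then steps up.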

The sixth and seventh columns in table 1 report event-time beta estimates and
associated t-statistics against the null hypothesis that the betas equal unity, the expected
value of securities’ relative risks. There is evidence of a small increase in relative risk at
or after earnings announcements. Each of the betas on days −1, 0, and +1 exceeds
each of the previous ten event-day betas; on average, they are 6.7 percent larger. The
t-statistics on days \(\tau = -1\) and +1 are 2.71 and 3.54, which means that betas on these
event days reliably exceed 1. In general, the estimated betas around earnings
announcement days are, in absolute terms, only marginally greater than unity. The average of the
ten postannouncement betas is 3 percent higher than the preannouncement average.⁹
Surprisingly, the beta on day \(\tau = 0\) is indistinguishable from unity. Overall, the evidence
is that, around earnings announcements, there is only slightly more information than
normal that covaries with the market. Given the relatively large increases observed in
standard deviations of returns, the implication is that earnings information causes
primarily diversifiable risk.

Similarly, the smaller adjusted R² estimates reported in column 8 of table 1 on days
\(\tau = 0\) and +1 are consistent with the hypothesis that increased announcement-period
volatility is primarily unsystematic (i.e., not a marketwide effect). The regressions use
equal-weighted portfolio returns, so the lower R² estimates imply that earnings
information is highly cross-correlated among firms announcing on a particular day, but
that the information is not unusually correlated with the market.¹⁰ This suggests
industry effects or other submarket commonalities in firms’ earnings information. The
implications for the uncertainty resolution hypothesis are unclear.


Evidence: Small and Large Firms


Tables 2 and 3 report statistics, corresponding to those reported in table 1, for the
smallest and largest deciles of firms. We form equal-weighted portfolios of all the small
or large firms announcing earnings on a common calendar date. Size is measured as
equity market value and proxies for cross-sectional variation in the amount of
information arrival around earnings announcements. The information hypothesis implies that
smaller firms have larger standard deviation and beta increases around earnings
announcements.


8 [illegible] announcement dates. We examine this possibility [illegible]
returns from their earliest feasible earnings announcement dates following a fiscal quarter-end. These results
are discussed later in this section.

9 We will show later that this is not explained by increased trading volume around earnings
announcements (Beaver 1968) affecting the beta estimates (Scholes and Williams 1977).

10 If the earnings information and the associated increased volatility were cross-correlated neither with
other announcing firms nor with the market, then at a portfolio level (i.e., the portfolio of firms announcing
earnings on a common calendar date) it would be diversified [illegible]
regression R² on earnings announcement days.


Table 2

Daily Average Total Returns, Standard Deviation of Returns, Abnormal Returns,
and Systematic Risk Estimates for the Decile of Smallest Market Capitalization
Firms for the 21 Days Around Their Quarterly Earnings Announcements*

Day   R (%)    SD (%)   Alpha (%)   t(alpha=0)   Beta       t(beta=1)   Adj. R²   CAR (%)
−10    0.157    4.33     0.058       0.84         0.88      −1.57        0.071     0.058
−9     0.159    4.64     0.112       1.69         0.76      −3.22        0.057     0.170
−8     0.126    4.41     0.022       0.31         0.88      −1.58        0.072     0.192
−7     0.182    4.23     0.092       1.38         0.94      −0.82        0.086     0.284
−6     0.112    4.13     0.068       1.06         0.76      [illeg.]     0.062     0.352
−5     0.138    4.04     0.024       0.38         0.91      −1.32        0.093     0.376
−4     0.119    4.19     0.034       0.50         0.78      −2.88        0.057     0.410
−3     0.247    4.76     0.143       1.89         0.73      −3.11        0.039     0.554
−2     0.210    4.41     0.121       1.74         0.79      −2.71        0.057     0.675
−1     0.287    5.05     0.213       2.72         0.97      −0.35        0.089     0.888
0      0.501    6.38     0.543       5.45         1.11       1.01        0.056     1.431
+1     0.328    6.11     0.308       3.19         0.87      −1.20        0.036     1.739
+2     0.087    4.84     0.004       0.05         0.93      −0.84        0.087     1.743
+3    −0.063    4.56    −0.092      −1.25         0.78      −2.85        0.049     1.651
+4    [illeg.]  4.60     0.101       1.48         0.92      −1.02        0.073     1.752
+5    −0.017    4.48    −0.118      −1.62         0.91      −1.01        0.056     1.633
+6     0.056    4.34    −0.023      −0.33         0.83      −2.17        0.081     1.611
+7     0.142    4.60     0.071       0.98         1.16       1.78        0.087     1.682
+8     0.035    4.23    −0.003      −0.04         0.87      −1.57        0.080     1.679
+9     0.104    4.46    −0.001      −0.02         0.86      −1.69        0.058     1.678
+10    0.012    4.04    −0.110      −1.74         1.00      [illeg.]     0.094     1.568


* The sample consists of the decile of smallest market capitalization stocks selected from among 51,178
NYSE-AMEX firm-quarter observations from 1980-1988. Day is the trading day \(\tau\) relative to the earnings
announcement date; R is \(R_\tau\), the equal-weighted total return on event day \(\tau\); SD is \(\sigma_{R\tau}\), the cross-sectional
standard deviation of event-day \(\tau\) returns, using the return on one randomly selected firm from each calendar
date; Alpha and Beta are \(\hat\alpha_\tau\) and \(\hat\beta_\tau\), the abnormal return and systematic risk estimates obtained from:

\[ R_{p\tau t} - R_{ft} = \alpha_\tau + \beta_\tau (R_{mt} - R_{ft}) + \varepsilon_{p\tau t} \]

where \(R_{p\tau t}\) is the return on an equal-weighted portfolio of all the firms reporting quarterly earnings on
calendar date t (1,721 distinct calendar dates). t-statistics are for the null hypotheses \(\alpha_\tau = 0\) and \(\beta_\tau = 1\), for each
event-day \(\tau\). CAR is \(CAR_\tau\), the cumulative abnormal return from event-day −10 to \(\tau\).


The small firms’ announcements occurred on 1,721 unique calendar days. Their
announcement-period total returns are relatively large, compared both to the returns of
small firms on days farther away from the earnings announcement date and to the
returns of firms on average during equivalent days (table 1). For example, average
returns on days −1, 0, and +1 are 0.287, 0.501, and 0.328 percent. In contrast, the daily
average total return over days −10 through −2 is 0.161 percent, and over +2 through
+10 the average is 0.048 percent.

The 6.38 percent standard deviation of returns on event day \(\tau = 0\) is 1.45 times the
18-day average of 4.41 percent for these firms (excluding −1 through +1). Compared to
the corresponding factor of 1.30 for firms in general in table 1, this implies a greater
relative amount of information arriving at the time of small firms’ earnings
announcements. Systematic risk estimates in event time are reported in column 6 of table 2. The
announcement day beta is 1.11, which is larger than the beta on surrounding days.


Table 3

Daily Average Total Returns, Standard Deviation of Returns, Abnormal Returns,
and Systematic Risk Estimates for the Decile of Largest Market Capitalization
Firms for the 21 Days Around Their Quarterly Earnings Announcements*

Day   R (%)    SD (%)   Alpha (%)   t(alpha=0)   Beta       t(beta=1)   Adj. R²   CAR (%)
−10    0.037    1.97    −0.040      −0.97         0.90      −2.39        0.265    −0.040
−9     0.047    1.98     0.003       0.08         0.96      −0.97        0.295    −0.037
−8     0.059    2.05    −0.036      −0.89         1.09       2.18        0.352    −0.073
−7     0.019    1.97    −0.064      −1.64        [illeg.]    1.27        0.312    −0.137
−6     0.018    2.02    −0.049      −1.26         1.01       0.25        0.324    −0.186
−5     0.078    1.98     0.055       1.41         1.04       0.98        0.336    −0.131
−4     0.043    1.84     0.010       0.29         1.02       0.53        0.366    −0.121
−3     0.037    2.00     0.026       0.70         1.07       1.77        0.361    −0.094
−2     0.087    1.96    −0.027      −0.76         1.07       1.83        0.379    −0.122
−1     0.018    2.17     0.027       0.68         1.08       1.89        0.337    −0.094
0      0.012    2.54     0.023       0.48         1.05       1.01        0.260    −0.072
+1     0.058    2.47    −0.001      −0.02         1.08       1.65        0.280    −0.073
+2     0.044    2.15    −0.037      −0.94         1.07       1.67        0.337    −0.110
+3     0.002    2.11     0.002       0.05         1.00       0.17        0.308    −0.108
+4     0.025    2.22     0.027       0.69         1.14       3.35        0.367    −0.080
+5     0.084    1.99     0.020       0.54         1.12       3.11        0.395    −0.061
+6     0.160    2.01     0.082       2.12         0.97      −0.87        0.268     0.021
+7     0.107    1.95     0.011       0.30         1.11       2.04        0.408     0.032
+8     0.094    1.97    −0.025      −0.83         0.80      −2.32        0.254     0.007
+9     0.141    1.99     0.023       0.61         0.93      −1.72        0.289     0.030
+10    0.150    2.00     0.017       0.48         1.08       1.42        0.327     0.047


* The sample consists of the decile of largest market capitalization stocks selected from among 51,178
NYSE-AMEX firm-quarter observations from 1980-1988. Day is the trading day \(\tau\) relative to the earnings
announcement date; R is \(R_\tau\), the equal-weighted total return on event day \(\tau\); SD is \(\sigma_{R\tau}\), the cross-sectional
standard deviation of event-day \(\tau\) returns, using the return on one randomly selected firm from each calendar
date; Alpha and Beta are \(\hat\alpha_\tau\) and \(\hat\beta_\tau\), the abnormal return and systematic risk estimates obtained from:

\[ R_{p\tau t} - R_{ft} = \alpha_\tau + \beta_\tau (R_{mt} - R_{ft}) + \varepsilon_{p\tau t} \]

where \(R_{p\tau t}\) is the return on an equal-weighted portfolio of all the firms reporting quarterly earnings on
calendar date t (1,283 distinct calendar dates). t-statistics are for the null hypotheses \(\alpha_\tau = 0\) and \(\beta_\tau = 1\), for each
event-day \(\tau\). CAR is \(CAR_\tau\), the cumulative abnormal return from event-day −10 to \(\tau\).


Thus, estimates of small firms’ event-time systematic risk are consistent with both
uncertainty resolution and the smaller firms’ earnings being proportionally more
informative.

The daily abnormal returns in column 4 and the CARs in the last column reveal that
small firms earn significant positive abnormal returns prior to and around earnings
announcement days. The \(\alpha_\tau\) values for event-days −1, 0, and +1 reliably exceed zero at
the conventional significance level. Small firms earn an average cumulative 1.33
percent abnormal return over a five-day trading period from \(\tau = -3\) to +2. Thus, four
quarterly earnings announcement periods each year provide an opportunity to earn a
5.32 percent abnormal return from holding the portfolio of small stocks over a total of
20 trading days per year. In other words, a substantial proportion (if not all) of the
size-effect observed in daily returns (see, e.g., Banz 1981; Reinganum 1981; Keim 1983)
could be because of return behavior around earnings announcements. These average
abnormal returns appear too large to attribute to biases in estimating risk, given the
average daily risk premium.

Figure 2. Cumulative Abnormal Returns Around Earnings Announcements (vertical axis: CAR in percent)

The small firms’ CAR is plotted in figure 3. Comparison with the CAR for all firms
in figure 1 reveals larger abnormal return magnitudes around the small firms’ earnings
announcements. The presence of positive abnormal returns around earnings
announcements for small firms is inconsistent with the uncertainty resolution hypothesis.
The CAR pattern is inconsistent with the information hypothesis.¹¹

Table 3 reports corresponding estimates for the decile of largest firms in equity
market value, based on 1,283 portfolio return observations.¹² Their total and abnormal
returns around earnings announcement days are as expected; only one of the 21
event-day abnormal returns reliably differs from zero at the 5 percent level, which is
approximately the relative frequency expected by chance. The standard deviation of returns on
day 0 is 2.54 percent, which is 1.26 times the average during the interval −10 through
+10, excluding −1 through +1. However, large firms’ relative risk estimates do not
appear to increase around their earnings announcements.


11 [illegible] earnings announcement date. We found no evidence that the small firms earned negative
cumulative average returns in the days preceding their actual earnings announcement dates. We did not examine
returns earlier than \(\tau = -52\) trading days [illegible] effect of the previous quarter’s earnings
announcement.

The CAR values reported in table 3 and graphed in figure 2 suggest that large firms 
earn only a 0.047 percent total abnormal rate of return over the 21 event days. The CAR 
pattern for large firms is not consistent with the information hypothesis because the 
small negative abnormal returns in the preannouncement period are not reversed by 
the earnings announcement day. In addition, none of the preannouncement or an- 
nouncement-period abnormal returns is significantly different from zero. 


Evidence: Nonsynchronous Trading


One explanation for betas increasing for small but not large firms is the differential
effect of nonsynchronous trading. The effect of nonsynchronous trading on
relative-risk estimates for small firms is likely to be lower on the earnings announcement day,
because even small firms’ stocks are traded frequently on earnings
announcement days. The small firms’ announcement-day beta is expected to differ
from the nonannouncement-day beta, which is likely to be biased downward. Hence,
estimated betas of small firms would increase around earnings announcements. In
contrast, large firms are more actively traded over the entire 21-day period and thus are
likely to exhibit smaller beta shifts. We estimate Scholes and Williams (1977) betas,
which reduce the bias due to nonsynchronous trading, to discriminate between
alternative explanations for beta shifts.
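
For readers unfamiliar with the estimator, the Scholes and Williams (1977) beta sums the slopes from three simple regressions of the security return on the lagged, contemporaneous, and leading market return, and divides the sum by one plus twice the first-order autocorrelation of the market return. The Python sketch below applies that formula to a single time series. How the estimator is adapted to event-time portfolios is not described here, so the time-series form, the function names, and the use of an AR(1) slope as the autocorrelation estimate are all assumptions.

```python
def simple_slope(y, x):
    """Slope from a one-regressor OLS of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("regressor has no variation")
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx


def scholes_williams_beta(ret, mkt):
    """Scholes-Williams beta for equal-length series of consecutive daily returns."""
    if len(ret) != len(mkt) or len(ret) < 5:
        raise ValueError("need at least five paired observations")
    inner = ret[1:-1]                       # days with both a lag and a lead
    b_lag = simple_slope(inner, mkt[:-2])   # on the previous day's market return
    b_now = simple_slope(inner, mkt[1:-1])  # on the same day's market return
    b_lead = simple_slope(inner, mkt[2:])   # on the next day's market return
    # First-order autocorrelation of the market, approximated by an AR(1) slope.
    rho = simple_slope(mkt[1:], mkt[:-1])
    denom = 1.0 + 2.0 * rho
    if abs(denom) < 1e-12:
        raise ValueError("1 + 2*rho is too close to zero")
    return (b_lag + b_now + b_lead) / denom
```

When a stock trades every day, the lagged and leading slopes should be close to what market autocorrelation alone implies, and the estimate stays near the OLS beta; when trading is thin, part of the covariance shows up in the lagged term, which plain OLS ignores.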

The Scholes and Williams betas exhibit a more pronounced increase in small firms’
relative risk around earnings announcements than OLS betas, although both types
exhibit higher volatility over event time. Thus, consistent with the uncertainty resolution
hypothesis, controlling for nonsynchronous trading results in an increase in small
firms’ betas on earnings announcement days. The increase, however, is too small to
account for the increase in total returns for small firms around earnings announcement
days: on the event days in which small firms earn the largest abnormal returns (days
−1, 0, +1), the OLS and the Scholes-Williams estimated betas are essentially identical.
As a result, the abnormal returns around small firms’ earnings announcements are not
explained by risk increases.

The Scholes and Williams beta estimates of large firms do not reveal substantial
changes in event time and generally are smaller than OLS betas. This is consistent with
an upward bias in the large firms’ OLS betas because of their higher market turnover.


12 The number of unique calendar days on which large firms’ announcements occur is smaller than that for
small firms, because large firms’ announcements are more likely to bunch in calendar time. Smith and Pourciau
(1988) report that large firms are more likely to have December 31 year-ends.

III. Reexamination of Hand’s Tests of the Functional Fixation Hypothesis


We first summarize Hand’s (1990) hypothesis and his tests using debt-equity swap 
data, and argue that his hypothesis does not predict a size-related effect. We present 
evidence showing an effect similar to what he observes in swap-gain firms for the pop- 
ulation of NYSE-AMEX firms. We then analyze Hand’s sample of swap-gain firms and 
demonstrate that the effect Hand observes in his data is indistinguishable from the
firm-size effect.¹³


Summary of Hand’s Hypothesis and Tests 


Under the extended functional fixation hypothesis (EFFH), only “unsophisticated”
investors fail to distinguish correctly the valuation implications of the components of
reported earnings. Their response to reported earnings is mechanistic, and it governs
stock valuation when the marginal investor (who is assumed to alone determine price)
is unsophisticated. Hand tests this hypothesis using debt-equity swap data. During
1981-1984, some firms swapped debt that was selling below par for equity. The
resulting nontaxable book gain flowed into quarterly earnings. The swap and its effect on
earnings typically (but not always) were reported in The Wall Street Journal within two
days. The earnings announcement was a median of 44 days after the swap
announcement.

Hand (1990) hypothesizes that unsophisticated investors (1) perceive the swap gain
to be real; (2) react to the swap gain (again) at the time of the quarterly earnings
announcement; yet (3) might not realize that the debt-equity swaps are transitory
one-period gains. Hand hypothesizes a positive stock price reaction, increasing with the
swap gain, but only when the marginal investor is unsophisticated. Because this cannot
be observed, his tests use a proxy for the probability that the marginal investor at the
time of the earnings announcements is unsophisticated.¹⁴ One proxy that Hand
uses is a negative, log-linear transformation of firm size, denoted by PR_i:¹⁵


$$PR_i = \frac{\log(\max MV) - \log(MV_i)}{\log(\max MV) - \log(\min MV)},$$


where max MV and min MV are, respectively, the maximum and minimum market
values of equity of the NYSE-AMEX firms at the end of 1982, and MV_i is the market
value of equity of firm i at that date.
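The transform can be illustrated with a minimal sketch. The function name `pr_proxy` and the three sample market values are hypothetical; the bounds of $58,000 million and $5 million are the ones given in the notes to table 5, and natural logarithms are assumed.

```python
import math


def pr_proxy(mv, max_mv, min_mv):
    """Negative log-linear transform of firm size, scaled to lie in [0, 1].

    All arguments are market values of equity in the same units
    (here, millions of dollars). Larger firms get smaller PR values.
    """
    if not (0 < min_mv < max_mv) or not (min_mv <= mv <= max_mv):
        raise ValueError("require 0 < min_mv <= mv <= max_mv and min_mv < max_mv")
    return (math.log(max_mv) - math.log(mv)) / (math.log(max_mv) - math.log(min_mv))


# Hypothetical firms: the smallest, a mid-sized, and the largest firm.
MAX_MV, MIN_MV = 58_000.0, 5.0  # bounds quoted in the notes to table 5
for mv in (5.0, 500.0, 58_000.0):
    print(f"MV = {mv:>8.0f}  PR = {pr_proxy(mv, MAX_MV, MIN_MV):.3f}")
```

Because only the denominator depends on min MV, changing that bound rescales every PR value by a common factor, which is why the scale of a coefficient on PR is arbitrary.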

Hand then estimates the following regression (all right-hand side variables are de- 
flated by the market value of equity at the beginning of the two-day announcement 
period): 


$$AR_i = \alpha_1 + \alpha_2 UZ_i + \alpha_3 SGAIN_i + \alpha_4 (PR_i \times SGAIN_i) + \epsilon_i, \qquad (2)$$

where:

AR_i = two-day earnings announcement period stock prediction error,
UZ_i = unexpected earnings, defined as reported earnings minus the swap gain
and the Value Line earnings forecast,


13 We are grateful to John Hand for supplying these data.

14 It is not even given an exogenous specification in the theory. Hand (1990, 741) defines an “unsophisti- 
cated” investor as one who “can be systematically misled by firms’ accounting methods and choices,” which is 
tantamount to defining the causal variable in terms of its result. 

15 Hand uses two other scaled measures as alternative proxies: number of institutional holders of the stock
and proportion of the stock held by institutions. Because these proxies are positively correlated with PR_i, which is
based on firm size, and yield very similar results (Hand 1990), we work with PR_i alone.
SGAIN_i = earnings resulting from the debt-to-equity swap,
PR_i = probability that the marginal investor pricing the firm’s stock is
unsophisticated, and
α₁, α₂, α₃, and α₄ are regression constants and ε_i is a disturbance term.


Ordinary least squares and weighted least squares results reported in Hand (1990)
reveal that α₄ is positive and statistically significant. Hand interprets this as evidence
inconsistent with the efficient markets hypothesis, but consistent with the EFFH.


A Critique of Theory and Proxy Variables 


We emphasize three properties of the swap data. First, as observed above, PR_i is a
negative log-linear transform of firm size. Second, the swap gain and the interaction
(PR_i × SGAIN_i) variables are significantly positively correlated: the product moment
correlation between these two variables is 0.97 (Hand 1990, table 4). The swap gain
variable thus is likely to be positively correlated with PR_i: the product moment correlation
between these two variables is 0.33, p-value < 0.01. Finally, the unexpected earnings
variable is essentially uncorrelated with the swap gain variable (Hand 1990, table 4), so
the coefficients on these variables are unaffected by including them in a multiple
regression.

Inadequate Theory. Hand’s model is not developed in terms of excess demand. The 
model appears to confuse stocks and flows of securities: it assumes a fixed supply of 
securities, independent of price, and thus of events such as earnings outcomes and 
swap gains. But fixed supply connotes the total number of shares outstanding at a point
in time, not those exchanged. The model assumes that degree of sophistication is a 
characteristic of demand alone. For example, if the seller is sophisticated and the ask 
price does not reflect swap gains, then price is not determined by an unsophisticated 
buyer alone. 

Equilibrium requires that every investor must be “marginal.” For an equilibrium to
exist at closing in Hand’s model, it therefore must be prohibitively costly for sophisticated
investors to trade with others. Closing prices cannot exceed sophisticated investors’
assessments of a stock’s worth, at the margin, a condition that is violated in
Hand’s model; sophisticated investors would be net sellers.¹⁶ In summary, we believe
the market setting assumed in Hand (1990) is too simple to predict the hypothesized
price behavior.

We also are not convinced that Hand’s model allows for predictions unambiguously
different from those of market efficiency and functional fixation hypotheses. If
“efficiency” is defined as a property of competitive capital markets, then it is a statement
about returns in relation to economic costs (see Ball 1990). An important issue is
whether the implications of market efficiency in this context are as Hand portrays
them. If unsophisticated investors are simply those investors with higher costs of
processing information (including the valuation implications of various earnings
components), then in competitive equilibrium sophisticated investors can be inframarginal
and can earn higher returns than the unsophisticated. (Their rents would be equated
across securities, however, so no size effect would be predicted.) Positive information
processing costs are not inconsistent with market efficiency.


16 Hand (1990, fn. 9) ignores such behavior on the part of the sophisticated investors (1) for the sake of
analytical tractability; and (2) due to his belief that the costs and risks to sophisticated investors from trading
with unsophisticated investors are large.
Size as a Proxy. It is even more difficult to see how unsophisticated investors’
behavior can vary systematically with the firm-specific proxies used. Hand’s hypothesis
requires some barrier to prevent sophisticated investors from trading with the
unsophisticated. The proxies used for the likelihood that the marginal investor is
sophisticated are essentially costlessly observable since all are public information. We
therefore do not see a clear theoretical case that unsophisticated investors increase the
likelihood of higher prices in swap-gain quarters, or that size is an appropriate proxy in
this context.

We are skeptical of using size as a proxy for other reasons as well. The empirical
relation between size and abnormal returns is well known (see, e.g., Banz 1981;
Reinganum 1981). Using size as a proxy for any independent variable, when the dependent
variable is returns or abnormal returns, increases the likelihood of rejecting the
null hypothesis. It is a low-power test to discriminate between the extended functional
fixation hypothesis and size-related effects on security prices. This is especially true at
the turn of the year (Keim 1983) and, as previously reported in section 2, at earnings
announcements.

Further, if size proxies for investor sophistication in some (i.e., swap-gain) quarters,
then consistency requires it should also serve as a proxy of a similar nature in other
quarters. Then our results in section 2 would suggest that unsophisticated investors
(i.e., primarily the small-firm investors) routinely earn higher total and abnormal
returns around earnings announcements than do sophisticated investors (i.e., primarily
the large-firm investors). This results in an implausible conclusion that, on average,
investor unsophistication yields positive abnormal returns.


Further Evidence: All-Firm Results


A central issue is whether the PR variable proxies for investor sophistication, or 
whether it captures a version of the anomalous size effect we observe in section 2 for 
firms in general. We first estimate the relation between PR and abnormal returns for 
the population of firm quarters. The hypothesis is that PR is associated with abnormal 
returns, without conditioning on the existence or magnitude of events (notably, swap 
gains) that might mislead unsophisticated investors. 

We estimate the following regression, which is similar to Hand’s (1990) equation (2): 


$$R_{iq} = \alpha_1 + \alpha_2 R_{mq} + \alpha_3 UE_{iq} + \alpha_4 PR_{iq} + \epsilon_{iq}, \qquad (3)$$

where:

R_iq = (total or market-adjusted) return over a τ-day period, including the earnings
announcement (day 0) on COMPUSTAT for firm-quarter iq,

UE_iq = E_iq − E_iq−4, where E_iq is earnings before extraordinary items and discontinued
operations for quarter q, deflated by the market value of equity at the beginning
of the quarter,

R_mq = CRSP equal-weighted market return for the τ-day period,

PR_iq = [log(max MV) − log(MV_iq)]/[log(max MV) − log(min MV)], max MV = $99 billion,
min MV = $1 million, and MV_iq = market value of equity at the beginning
of firm-quarter q for firm i, in millions.
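The unexpected earnings variable defined above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function name, the list-based inputs, and the sample figures are hypothetical, and the screen on |UE| mirrors the 100 percent cutoff described in the surrounding text.

```python
def unexpected_earnings(earnings, market_values, lag=4, max_abs=1.0):
    """Seasonal random-walk unexpected earnings, deflated by market value.

    earnings:      quarterly earnings for one firm, oldest first ($ millions)
    market_values: beginning-of-quarter market value of equity, same order
    Returns a list aligned with the inputs. An entry is None when no
    year-ago quarter exists, when market value is not positive, or when
    |UE| exceeds max_abs (1.0 corresponds to 100 percent).
    """
    if len(earnings) != len(market_values):
        raise ValueError("inputs must have the same length")
    result = []
    for q, (e, mv) in enumerate(zip(earnings, market_values)):
        if q < lag or mv <= 0:
            result.append(None)
            continue
        ue = (e - earnings[q - lag]) / mv  # (E_q - E_{q-4}) / MV_q
        result.append(ue if abs(ue) <= max_abs else None)
    return result


# Hypothetical firm: five quarters of earnings with a constant market value.
ue = unexpected_earnings([10.0, 12.0, 11.0, 13.0, 14.0], [100.0] * 5)
print(ue)
```

Only the fifth quarter has a year-ago comparison, so it is the only one that yields a value.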


We exclude firms with market capitalization under $1 million, or with unexpected
earnings (assuming a seasonal random walk model) exceeding 100 percent in absolute
Table 4

Regression of Earnings Announcement Period Returns on the Market Return,
Unexpected Earnings, and Transformed Size: Ordinary
Least Squares Analysis of NYSE-AMEX Firmsᵃ

$$R_{iq} = \alpha_1 + \alpha_2 R_{mq} + \alpha_3 UE_{iq} + \alpha_4 PR_{iq} + \epsilon_{iq}$$

Return             Return     α₁             α₂             α₃             α₄             Adjusted
Metric             Window     (t-statistic)  (t-statistic)  (t-statistic)  (t-statistic)  R²

Total Return       -1 to +1   -0.0005        1.00           0.04           0.06           0.14
                              (-1.73)        (0.10)         (18.85)        (11.88)
                   -1 to +1   0.0004                        0.03           0.07           0.01
                              (1.23)                        (18.72)        (13.13)
                   -1 to 0    0.0000         1.04           0.07           0.04           0.17
                              (0.02)         (3.65)         (25.64)        (10.41)
                   -1 to 0    0.0004                        [illegible]    0.05           0.01
                              (1.68)                        (21.84)        (11.92)
                   0          0.0000         1.05           0.08           0.03           0.11
                              (-0.14)        (3.58)         (25.55)        (7.36)
                   0          0.0003                        0.08           0.03           0.01
                              (1.63)                        (23.12)        (8.36)
Market-Adjusted    -1 to +1   -0.0005                       0.04           0.06           0.01
Returnᵇ                       (-1.71)                       (18.84)        (11.88)
                   -1 to 0    0.0000                        0.07           0.04           0.02
                              (0.08)                        (25.49)        (10.50)
                   0          -0.0000                       0.08           0.03           0.01
                              (-0.08)                       (25.60)        (7.43)


ᵃ The sample is 49,864 NYSE-AMEX firm-quarter announcements in 1980-1988. R_iq = return over a τ-day
period including the earnings announcement date (day 0); UE_iq = (E_iq − E_iq−4)/MV_iq, where E_iq is earnings for
firm i in quarter q and MV_iq is market value of equity at the beginning of fiscal quarter q, in millions of dollars
(all observations with |UE_iq| > 100% are deleted); R_mq = CRSP equal-weighted market return for the τ-day
period; PR_iq = [log(max MV) − log(MV_iq)]/[log(max MV) − log(min MV)], max MV = $99 billion, min MV = $1
million.

ᵇ Market-adjusted returns are R_iq − R_mq.

value, to avoid excessive influence on the parameter estimates. This reduces the sample
size by 2.6 percent to 49,864.¹⁷

Does PR Measure Investor Sophistication? Results of estimating an OLS regression
model of the relation between earnings announcement period returns and UE and PR
are reported in table 4.¹⁸ The first row reports results using three-day total returns. To
assess the robustness of the findings, we report results from alternative return metrics,
return windows, and estimation methods in other rows of the table.


17 The results are similar when fewer observations (e.g., if |UE_iq| > 200 percent) are excluded, consistent
with Brown et al. (1987a).

18 The adjusted R²s of the regressions reported in table 4 are considerably smaller than those reported in
table 1. We obtain higher R²s in table 1 because regressions are estimated using data on portfolios consisting of
stocks announcing on a common calendar date. We use firm-specific data in table 4 because unexpected
earnings is also included as an independent variable.


734 The Accounting Review, October 1991


The coefficient on R_mq, 1.00, estimates the cross-sectional average beta of the sample
stocks. The t-statistic for the hypothesis that it equals 1 is 0.10, which means that the
average relative risk of the sample stocks is indistinguishable from the market portfolio’s
relative risk. The coefficient on UE, 0.04 (t-statistic = 18.84), reliably exceeds zero, but in
absolute terms it is small and suggests that the seasonal random-walk earnings expectation
proxy is noisy, given the three-day return window (see, e.g., Brown et al. 1987b).

The coefficient of 0.063 on PR is significantly positive (t-statistic = 11.86). Thus, PR
is positively related to earnings-announcement-period returns regardless of earnings
news (after partialling out the effect of the earnings variable). This result questions the
validity of the PR variable as a proxy for investor sophistication, since it would imply
that unsophisticated investors routinely earn more than sophisticated investors at
earnings announcements.

Specification Checks. Regressions without R_mq as an independent variable are
reported in the second row of table 4. The coefficient on PR increases slightly to 0.07
(t-statistic = 13.13). We conjecture the increase is due to size proxying for the effect of
relative risk on returns. The next four rows of table 4 reveal robust findings in connection
with the return window. The PR variable is always significantly positive. We also
estimate all regressions in table 4 using weighted least squares, weighting observations
by the inverse of their standard deviation of daily returns, with virtually unchanged
results.


Further Evidence: Analysis of Hand’s Swap-Firm Data


We next address the swap-gain firms themselves. Abnormal returns in the two-day
quarterly earnings announcement period are estimated from the market model, fitted
over the 300 days after the announcement (see Hand 1990), using the CRSP equal-weighted
daily return index. Unexpected earnings (UE) are calculated using a seasonal
random-walk model, with the swap gain subtracted from reported earnings. In conformance
with Hand (1990), PR is calculated using his log-linear transform of market value
at the end of December 1982, and UE and SGAIN are deflated by the market value of
equity at the beginning of the two-day announcement period. Because Hand’s OLS and
weighted least squares results are virtually indistinguishable, we report OLS results
only. We analyze those 223 of Hand’s 239 firms for which we could obtain complete
return data.²⁰

Table 5 reports estimates from the following expanded version of equation (3):

$$AR_i = \alpha_1 + \alpha_2 UE_i + \alpha_3 SGAIN_i + \alpha_4 (PR_i \times SGAIN_i) + \alpha_5 PR_i + \epsilon_i. \qquad (4)$$

When PR is not included, the results are very similar to those in Hand (1990, table
3). In the first row, the coefficients on UE and SGAIN are significantly positive. When
SGAIN and (PR × SGAIN) are both included (second row), neither coefficient is significant,
in part because the two variables are highly collinear. In the third row, the coefficient
on the (PR × SGAIN) variable is 0.83 (t-statistic = 2.17). This is consistent with the
EFFH.
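The effect of this collinearity on precision can be gauged with a variance inflation factor. The sketch below is illustrative only. It assumes the textbook two-regressor formula VIF = 1/(1 − r²), takes r = 0.97 from the correlation between SGAIN and (PR × SGAIN) cited earlier from Hand (1990, table 4), and ignores the other regressors; the function name is hypothetical.

```python
import math


def variance_inflation_factor(r):
    """VIF for one of two regressors whose sample correlation is r."""
    if not -1.0 < r < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


r = 0.97  # correlation between SGAIN and PR x SGAIN (Hand 1990, table 4)
vif = variance_inflation_factor(r)
print(f"VIF = {vif:.1f}; standard errors inflated by a factor of {math.sqrt(vif):.1f}")
```

Under these assumptions the sampling variance of each coefficient is roughly 17 times what it would be with uncorrelated regressors, so insignificant individual coefficients are unsurprising when both variables enter together.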

19 It is important to note that the coefficient on PR is not biased upward because of the noise in
UE. The reason is that PR, which is a transformation of firm size, is included in the regression by itself, not as an
interaction with UE. Shevlin and Shores’ (1990) detailed examination of this issue yields the same conclusion.


20 Of the 16 excluded firms, five are listed on NASDAQ. We further exclude four observations because of
their extreme influence on the regression parameters, as suggested by [illegible] tests.
Table 5

Regression of Earnings Announcement Period Returns on the Swap Gain,
Unexpected Earnings, and the Probability that the Marginal Investor is
Unsophisticated: Ordinary Least Squares Analysis of Hand (1990) Dataᵃ

$$AR_i = \alpha_1 + \alpha_2 UE_i + \alpha_3 SGAIN_i + \alpha_4 (PR_i \times SGAIN_i) + \alpha_5 PR_i + \epsilon_i$$

α₁             α₂             α₃             α₄             α₅             Adjusted
(t-statistic)  (t-statistic)  (t-statistic)  (t-statistic)  (t-statistic)  R²

0.0027         [illegible]    0.48                                         [illegible]
(0.98)         [illegible]    (2.09)
0.0035         [illegible]    -0.01          [illegible]                   0.0313
(1.12)         [illegible]    (-0.01)        [illegible]
0.0038         0.20                          0.83                          0.0358
[illegible]    [illegible]                   (2.17)
-0.0084        0.17                                         0.036          0.0319
[illegible]    [illegible]                                  (1.95)
-0.0057        0.19                          0.64           0.024          0.0379
(-0.71)        [illegible]                   (1.53)         [illegible]
-0.0065        0.19           0.23           0.24           0.025          0.0336
[illegible]    (2.78)         (0.25)         (0.15)         (1.24)


ᵃ The sample consists of 219 of the 239 firm-quarter observations included in Hand (1990) that
reported debt-to-equity swaps sometime between 1981 and 1984. AR_i is two-day market model prediction
error; UE_i = (E_iq − E_iq−4)/MV_iq, where E_iq is earnings before extraordinary items and discontinued operations
and the debt-to-equity swap gains for quarter q, and MV_iq is the market value of equity at the beginning of
fiscal quarter q, in millions of dollars; SGAIN_i = gains from debt-to-equity swaps in millions of dollars
divided by the market value of equity at the beginning of the swap fiscal quarter; PR_i = [log(max MV)
− log(MV_i)]/[log(max MV) − log(min MV)], max MV = $58 billion, min MV = $5 million.


Hand (1990) does not report the effect of PR alone on abnormal returns. Results in
the fourth row reveal that the coefficient on PR is 0.036, which is significant by itself
(t = 1.95, one-tailed p-value < 0.05). Thus, abnormal returns at swap-gain quarters’
earnings announcements did decrease with firm size, which is consistent with the results
for the NYSE-AMEX population reported earlier. When (PR × SGAIN) and PR are
included simultaneously in the regression (fifth row), neither coefficient is significant,
in part because of their collinearity. Finally, when all four variables in equation (4) are
included simultaneously (sixth row), only the coefficient on earnings is significantly
positive.

The magnitude of the coefficient on PR is comparable to that of the coefficient estimated
for all firms and quarters in general (see table 4). The similarity of the coefficient
magnitude on PR in tables 4 and 5 makes it less likely that the security return behavior
around the swap-gain quarters is indicative of the market’s extended functional
fixation.²¹


21 The coefficient estimated for (PR × SGAIN) also is of interest. Hand (1990, 753-64) interprets the
estimate in terms of joint hypotheses concerning the ability of unsophisticated investors to discern that the
swap gain is a “one-off” event, the probability that the marginal investor is unsophisticated, and possible bias
in PR as a proxy for that probability. The coefficient magnitude, however, cannot be used to discriminate between
alternative specifications of the functional fixation hypothesis because PR is an arbitrary transform of
IV. Conclusions


We examine the existence and pattern of positive abnormal returns around earnings
announcements. These estimated abnormal returns could indicate: market inefficiency;
inadequacy of the CAPM or of the index of security returns (Ball 1978); risk
changes that are not captured by our research design; tax effects (including capital
gains taxes); a variation of Keim’s (1989) trading-mechanism bias due to trading
behavior around earnings announcements; and a chance result that, during the sample
period, on average, more good news than bad news was released via earnings
announcements for firms in general and for small firms in particular.²²

We use these results to reexamine Hand’s (1990) research on functional fixation.
We assess the theory as not sufficiently well-specified to predict cross-sectional variation
in the price response to swap gains as a function of variation in investor sophistication.
It is particularly unable to predict such variation as a function of size, which is
what Hand’s empirical research essentially tests. We analyze almost the entire population
of NYSE-AMEX firms around their quarterly earnings announcements during a
nine-year period overlapping the period examined by Hand. We also analyze his
swap-gain firms. We conclude that Hand’s PR variable (the probability that the marginal
investor is unsophisticated) is in effect a proxy for an anomalous size effect at earnings
announcements. Although this does not prove that PR fulfills an identical role
for the swap-gain firms in the swap quarters as it appears to for the population of firm
quarters, it does raise serious doubts about whether the anomaly uncovered by Hand is
explained by his hypothesis.

This is far from a complete explanation of the phenomenon. There are no credible
theories to predict a systematic relation between size and abnormal returns at earnings
announcements. The size-related return anomalies at earnings announcements await a
credible explanation.


size. The scalar (9.4) is the estimated range of NYSE-AMEX firm sizes, with “IBM taken as max MV,
log_e($58,000) = 11.0 and min MV was arbitrarily (but prior to empirical analysis) set equal to log_e($5) = 1.6.”
Assuming the smallest firm to be capitalized at $1 million, rather than $5 million, would scale the sample {PR_i}
values by 11.0, increasing the regression coefficient for (PR × SGAIN) by 22.2 percent. The coefficient has no
natural scale, so inferences on the basis of its magnitude are invalid. In the case of the number of institutional
investors (the second proxy), Hand (1990, 761) arbitrarily sets the minimum value at 0.69 [= log_e(2)]. If the
minimum number of institutional investors had been assumed to be (say) unity, which does not seem an
economically important change from 2, then the coefficient on this variant of PR would have been 10
percent larger. To demonstrate how arbitrary is the scale of PR, and thus the regression slope on PR × SGAIN,
note that there are many small firms with no institutional investors and that the log of zero is minus infinity.
Extended Functional Fixation and
Security Returns Around Earnings
Announcements: A Reply to
Ball and Kothari


John R. M. Hand 
University of Chicago 


The purpose of this article is to reply to Ball and Kothari’s (1991) insightful critique
of “A Test of the Extended Functional Fixation Hypothesis” (Hand 1990).
Fundamentally, Ball and Kothari argue that the effect I observe in the data is
indistinguishable from the firm-size effect. Using several alternative approaches, I show
empirically that their conclusion is premature. I also present some suggestions for
future work in the area of size-related anomalies in security returns.


I. A Reply to Ball and Kothari (1991)


Ball and Kothari examine the relations between CAPM risk, excess stock returns,
and firm-size in the days around quarterly earnings announcements. Their objective is
to test various hypotheses that might explain why prior research has found positive
average excess returns associated with these information releases. They conclude that
none of the hypotheses they examine explains this anomaly; moreover, they rigorously
document what appears to be a second anomaly, namely a reliably negative relation
between firm-size and excess stock returns at quarterly earnings announcements.

In light of this finding of a general firm-size effect, they reexamine the extended
functional fixation hypothesis proposed in Hand (1990) from a number of different
aspects. Among other criticisms, they are skeptical about the intuition underlying the
extended functional fixation hypothesis, and whether it can predict the empirical results
reproduced here in table 1. I agree in principle with many of their points, especially
their misgivings about the theory. Nevertheless, I noted in Hand (1990, 760) that:
Table 1
Cross-Sectional Regressions of the Two-Day Excess Stock Return
at the Swap Quarter’s Earnings Announcement (A_iw)ᵃ

MAXIMUM LIKELIHOOD on:
$$A_{iw} = b_1 + b_2 PR_i + b_3 p_{iw} SGAIN_i + b_4 UZ_i^j + v_i + \epsilon_{iw}$$
Percent metric; all observations included would be n = 233

      Null
Line  Hypothesis        b₁ (%)    b₂        b₃        b₄        R²      σ̂_v (%)   K-S      n

(1)   EFFH₁             0.07                0.85      0.21      0.038   3.03      0.90     232
                        (0.27)              (2.17)    (2.61)            (3.90)    (0.40)
(2)   EFFH₂             0.01                0.95      0.27      0.036   3.07      0.80     231
                        (0.05)              (2.39)    (2.63)            (3.93)    (0.55)
(3)   BK Size           -0.58     0.025               0.17      0.021   3.14      0.78     233
      using E₁(Z_i)     (-0.93)   (1.68)              (2.13)            (3.86)    (0.58)
(4)   BK Size           -0.61     0.025               0.24      0.024   3.17      0.73     232
      using E₂(Z_i)     (-0.98)   (1.71)              (2.40)            (3.99)    (0.66)
(5)   EFFH₁             -0.32     0.011     0.73      0.21      0.035   3.03      0.91     232
      vs. Size          (-0.51)   (0.68)    (1.70)    (2.80)            (3.90)    (0.38)
(6)   EFFH₂             -0.36     0.011     0.84      0.26      0.032   3.06      0.75     231
      vs. Size          (-1.58)   (0.68)    (1.92)    (2.61)            (3.93)    (0.63)


ᵃ The variables used are: measures of unexpected earnings (UZ_i^j), an interaction between Hand’s (1990)
proxy for the probability that the marginal investor pricing the firm’s stock was an unsophisticated investor
(p_iw) and the swap gain (SGAIN_i), and Ball and Kothari’s (1991) size-effect proxy for the same probability (PR_i).

Notes:
1. EFFH₁ is the extended functional fixation hypothesis under the assumption that unsophisticated
investors’ expectations of earnings E^u(Z_i) in the swap quarter are given by E₁(Z_i) = [illegible], whereas
EFFH₂ uses E₂(Z_i) = [illegible].

2. p_iw is the percentage of firm i’s common stock held by noninstitutional holders at the end of the
month prior to the swap quarter’s earnings announcement (Source: Standard & Poor’s Stock
Guide).

3. PR_i is a log-linear, scaled transformation of firm-size m_iw determined two days prior to the swap
quarter’s earnings announcement.

4. Both UZ_i^j and SGAIN_i are dollar amounts deflated by firm-size m_iw.

5. K-S is the Kolmogorov-Smirnov Z-statistic that the fitted residuals deflated by
$(\hat{\sigma}_v^2 + \hat{\sigma}_i^2)^{0.5}$ are ~ iid N(0, 1).

6. One SGAIN_i outlier lay 7.0 standard deviations from the mean of the distribution of [illegible] SGAIN_i,
and one UZ_i^j outlier lay 11.2 standard deviations from the mean of the distribution of
[illegible].

7. t-statistics with respect to a null value of zero are in parentheses, except for the K-S Z-statistic,
where it denotes the two-tailed p-value.

8. Null: v_i ~ iid N(0, σ_v²); ε_iw ~ indep N(0, σ_i²); cov(v_i, ε_kw) = 0 ∀ i, k.


[T]he (extended functional fixation) model . . . is not intended to be a rigorous one in the
strict microeconomic sense, and so several interesting theoretical issues concerning the
equilibrium characteristics of markets containing both sophisticated and unsophisticated
investors remain unexplored. For the moment, as with any new theory and
especially one tested in such a single circumstance, the extended functional fixation
hypothesis should be viewed with healthy skepticism.


Ultimately, however, the credibility of the extended functional fixation hypothesis
depends on how well its predictions correspond with reality, as represented by empirical
findings. It is this correspondence that is at issue. Fundamentally, Ball and Kothari
(1991, 730) consider that “the effect that Hand observes in his data is indistinguishable
from the firm-size effect.” The “Hand-effect” to which they refer is the result that the
coefficient estimate â₃ on the variable p_iw SGAIN_i reproduced in lines (1) and (2) of table
1 is reliably positive. The term p_iw SGAIN_i is the interaction between two sub-variables.
The first is my proxy for P_iw, the probability that the marginal investor in firm i’s stock
at the swap quarter’s earnings announcement time t = w is an unsophisticated investor.
This proxy is denoted by p_iw, and is taken to be the proportion of firm i’s stock held by
noninstitutional holders at the end of the month prior to the swap quarter’s earnings
announcement. p_iw is a “natural” proxy relative to two others described in Appendix B
of Hand (1990) in the sense that it is automatically scaled to lie between zero and one.¹ It
also best captures the intuition that the more of a firm’s stock held by unsophisticated
investors, the more likely it is that such investors will determine the stock price. The
second term is the swap gain deflated by firm-size just prior to the swap quarter’s
earnings announcement, denoted by SGAIN_i. Under the efficient market hypothesis, a₃
is predicted to be zero. Under the extended functional fixation hypothesis, a₃ is
predicted to be positive.² As shown in table 1, under either of two alternate
specifications of unexpected earnings UZ_i^j, the coefficient estimate on p_iw SGAIN_i is
reliably positive. This result is inconsistent with the efficient market hypothesis, but
consistent with the extended functional fixation hypothesis.

Ball and Kothari arrive at their conclusion that Hand’s (1990) effect is indistinguishable
from the general firm-size effect via the following steps:


1. They report in their table 4 that an alternative proxy for P_iw derived from the
natural logarithm of firm-size is reliably positively correlated with excess stock
returns at earnings announcements in general. In Appendix B of Hand (1990), I
denote this latter proxy p″_iw; Ball and Kothari denote the same by PR_i.

2. They note that in Appendix B of Hand (1990), p_iw and p″_iw are positively correlated
(product moment correlation of 0.30 for n = 233) and that, insofar as the
magnitude and inferences relating to â₃ are concerned, using p″_iw (viz. PR_i) in
place of p_iw yields very similar results to those reported in table 3 of Hand (1990,
758).

3. Because p_iw and PR_i are positively correlated, Ball and Kothari elect to work
solely with PR_i in their reanalysis of the cross-sectional relations between excess
stock returns at the swap quarters’ earnings announcements and potential
explanatory variables.

4. In line (4) of their table 5, they report that the effect of PR_i alone on excess stock
returns at the swap quarters’ earnings announcements is reliably positive. Lines
(3) and (4) of table 1 here corroborate their result; under either of two alternate
specifications of unexpected earnings UZ_i^j, the estimated coefficient b₂ on PR_i is
reliably positive by a one-tailed test. Thus, they conclude, excess stock returns




1 As Ball and Kothari correctly note, the two other proxies in Appendix B of Hand (1990)
require potentially arbitrary scaling assumptions, without which there is no guarantee that the probability
proxy will lie between zero and one.

2 Equations (21) and (22) of Hand (1990, 784) give details as to the predicted magnitude of a₃. If
unsophisticated investors see the swap gain as a real one-time amount, then a₃ = 1. If instead these same
investors see the swap gain as a real regular amount, then a₃ equals the cross-sectional mean of the parameter
mapping unexpected earnings into returns.
at swap quarters’ earnings announcements decrease with firm-size, as per the
general size-effect finding for the population of earnings announcements.

5. Last, they note that the magnitude of the estimated coefficient on PR_i at swap
quarters’ earnings announcements is comparable to that found at earnings
announcements in general, and that when both (PR_i × SGAIN_i) and PR_i are included
simultaneously in the regression, neither coefficient on the two variables is reliably
different from zero. Ball and Kothari therefore conclude that their findings
imply that “the effect that Hand (1990) observes in his data is indistinguishable
from the firm-size effect.”


This reasoning sounds persuasive, but it is less well-founded than it appears. I 
suggest five reasons why Ball and Kothari’s conclusion that the Hand-effect is indistin- 
guishable from the firm-size effect they document is premature. 

First, merely because p_iw and PR_i are positively correlated does not imply that the
two proxies are equivalent and interchangeable. Suppose that both p_iw SGAIN_i and PR_i
are imperfect proxies for some unobserved variable X_i that truly explains the target portion
of cross-sectional variation in excess stock returns at swap quarters’ earnings
announcements. Ball and Kothari assume that PR_i is at least as good a proxy for X_i as is
p_iw SGAIN_i on the basis of a positive correlation between PR_i and p_iw. However, they do
not test the veracity of this assumption. Lines (5) and (6) of table 1 are such a test.

In lines (5) and (6) of table 1, the regression in Hand (1990) is augmented to include
PR_i as an additional explanatory variable; equivalently, Ball and Kothari’s (1991) regression
is augmented to include p_iw SGAIN_i as an additional explanatory variable. Thus,
Hand’s p_iw SGAIN_i candidate variable is put into direct competition with Ball and
Kothari’s PR_i candidate variable. The results reported in lines (5) and (6) show that
under either of two alternate specifications of unexpected earnings UZ_i^j, the estimated
coefficient b₂ on PR_i is not reliably different from zero, but the coefficient estimate b₃
on p_iw SGAIN_i is reliably different from zero, by a one-tailed t-test.³

Second, Ball and Kothari demonstrate in their tables 1 and 2 that mean excess stock 
returns at earnings announcements in general are zero for the largest firm-size decile, 
and about 1 percent for the smallest firm-size decile. The simple ordinary least squares 
(OLS) mean excess stock return at the swap quarter’s earnings two-day announcement 
is 0.53 percent (cross-sectional t-statistic of 2.45). In principle, this 0.53 percent could 
be due to a firm-size effect, but only if a substantial fraction of the swapping firms were 
from the very smallest firm-size deciles. However, as of just prior to the swap quarter’s 
earnings announcement, the smallest firm-size decile of the 233 swapping firms was 
the fifth decile of the NYSE + AMEX population, and the median firm-size decile of the 
swapping firms was the ninth decile! Moreover, virtually identical firm-size decile 
membership patterns exist for each of the preceding and subsequent four quarterly 
earnings announcements. 

Third, if the Hand-effect is indistinguishable from the general firm-size effect, then 
one would expect similar positive mean excess stock returns to be found at surrounding 
quarterly earnings announcements. The results reported in table 2 suggest that this is 


³ Identical inferences are obtained if OLS is used to estimate $b_2$ and $b_3$. For example, in line (6), using OLS
yields $b_2$=0.020 (t-statistic of 1.08) and $b_3$=1.00 (t-statistic of 2.31). Not deleting any outliers also yields
identical inferences; for example, in line (6), with n=233, $b_2$=0.010 (t-statistic of 0.62) and $b_3$=0.79 (t-statistic
of 2.06).
Table 2
Estimates of the Cross-Sectional Mean Two-Day Abnormal Common
Stock Return ($c_\tau$) for those Quarterly Earnings Announcement
Dates $\tau \in [-4, +4]$ Event-Quarters from the Swap Quarter’s
Earnings Announcement ($\tau = 0$), and Other Key Statistics:
Percent Metric

MAXIMUM LIKELIHOOD on: $A_{i\tau} = c_\tau + \eta_{i\tau} + \varepsilon_\tau$
Null: $\eta_{i\tau} \sim$ iid $N(0, \sigma^2_\eta)$; $\varepsilon_\tau \sim$ indep $N(0, \sigma^2_\varepsilon)$; $\mathrm{cov}(\eta_{ik}, \varepsilon_\tau) = 0\ \forall i, k, \tau$.

Quarter     ML est.   Variance    Mean      Median    Var(A_τ)   % A_τ     c_τ -     Mean A_τ -
[# obs.]    c_τ       [in %²]     A_τ       A_τ       [in %²]    > 0       Med A_τ   Med A_τ

τ = -4       0.02      3.33       -0.13      0.02      10.8      51.1       0.00      -0.15
[225]       (0.12)    (4.18)     (-0.60)                         (0.33)

τ = -3      -0.17      4.23       -0.21     -0.13      11.3      47.1      -0.04      -0.08
[227]      (-0.81)    (4.64)     (-0.96)                        (-0.88)

τ = -2      -0.19      4.29       -0.18     -0.20      10.8      48.1       0.01       0.04
[233]      (-0.91)    (4.92)     (-0.73)                        (-0.59)

τ = -1       0.16      3.82        0.07      0.05      11.3      51.9       0.11       0.02
[231]       (0.73)    (4.32)      (0.33)                         (0.59)

τ = 0        0.38      3.33        0.53      0.19      10.9      54.5       0.17       0.34
[233]       (1.84)    (4.07)      (2.45)                         (1.38)

τ = +1       0.05      3.31        0.15      0.07      11.3      61.5      -0.02       0.08
[233]       (0.25)    (4.01)      (0.69)                         (0.48)

τ = +2       0.00      4.41        0.08      0.03      10.8      51.1      -0.03       0.05
[229]       (0.02)    (4.84)      (0.38)                         (0.33)

τ = +3       0.09      3.47        0.05      0.08       9.8      52.9       …         -0.03
[227]       (0.45)    (4.49)      (0.24)                         (0.86)

τ = +4       0.10      1.29        0.09     -0.01       8.5      49.8       0.11       0.10
[221]       (0.62)    (2.32)      (0.48)                        (-0.07)


Notes: 1. Two-day common stock prediction errors $A_{i\tau}$ are from a market model, and are calculated over the
          window (-1, 0), where (0) is the day that the earnings announcement appears in The Wall Street
          Journal Index, except for swap quarter 0. In swap quarter 0, day (0) is the day after the earnings
          announcement was reported by the Dow Jones News Retrieval’s //TEXT feature as having come
          over the Broad Tape. If the time-stamp was ≥4 p.m. EST, then day (0) was taken to be two days
          later. Market model parameters are estimated over the period (+2, +301).
       2. t-statistics with respect to a null value of zero are in parentheses, except for the “% of $A_\tau$ positive”
          statistic, where the null value is 50 percent.


not the case. Table 2 presents both maximum likelihood and OLS estimates of the mean
excess stock return at each of the four preceding and subsequent quarterly earnings
announcements in event time. When either estimation method is used, the mean excess
stock return at the swap quarter’s earnings announcement $\tau=0$ is reliably positive by a
one-tailed test, yet in all other event quarters the mean excess stock return is not
reliably different from zero.
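
The comparison across event quarters can be checked directly from the t-statistics reported in table 2. The sketch below is illustrative only: the one-tailed 5 percent critical value of 1.645 (a normal approximation, given more than 220 observations per quarter) is an assumption, since the text does not state the significance level used, and the names `ML_T`, `OLS_T`, and `reliably_positive` are hypothetical.

```python
# t-statistics transcribed from table 2 (null value of zero), keyed by event quarter.
ML_T = {-4: 0.12, -3: -0.81, -2: -0.91, -1: 0.73, 0: 1.84,
        1: 0.25, 2: 0.02, 3: 0.45, 4: 0.62}
OLS_T = {-4: -0.60, -3: -0.96, -2: -0.73, -1: 0.33, 0: 2.45,
         1: 0.69, 2: 0.38, 3: 0.24, 4: 0.48}

# Assumed one-tailed 5 percent critical value (normal approximation).
CRITICAL_VALUE = 1.645


def reliably_positive(t_stats, critical=CRITICAL_VALUE):
    """Return the event quarters whose t-statistic exceeds the critical value."""
    return sorted(q for q, t in t_stats.items() if t > critical)


if __name__ == "__main__":
    print("ML :", reliably_positive(ML_T))
    print("OLS:", reliably_positive(OLS_T))
```

Under that assumed threshold, both estimation methods single out the swap quarter and no other event quarter.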

Fourth, under the extended functional fixation hypothesis, all else held equal one
would predict that excess stock returns at swap quarters’ earnings announcements will
be skewed to the right relative to those at surrounding quarterly earnings
announcements. This is because the swap gain is almost always positive, and has a much smaller
cross-sectional variance than that of excess stock returns at typical non-swap quarters’
earnings announcements. Thus, if unsophisticated investors react to the swap gain at
the swap quarter’s earnings announcement, and if their reaction increases the firm’s
stock price, then the typical mean zero, high-variance excess stock return distribution
will be overlaid with a positive mean, low-variance distribution. This will result in an
overall distribution of excess stock returns at swap quarters’ earnings announcements
that is skewed to the right relative to that at typical non-swap quarters’ earnings
announcements. The rightmost two columns of table 2 report the differences between
each of the maximum likelihood and OLS estimates of the mean excess stock return and
the median excess stock return at different event-time earnings announcements. Both
metrics demonstrate that by this measure of skewness, the excess stock returns at the
swap quarter’s earnings announcement are the most skewed to the right out of all nine
event quarters.⁴
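
The mean-minus-median skewness measure can be recomputed from the point estimates in table 2. The following is a minimal sketch with hypothetical names; because it works from the rounded table entries, the recomputed gaps can differ in the second decimal from the table’s own difference columns.

```python
# Point estimates transcribed from table 2 (percent), keyed by event quarter.
ML_MEAN = {-4: 0.02, -3: -0.17, -2: -0.19, -1: 0.16, 0: 0.38,
           1: 0.05, 2: 0.00, 3: 0.09, 4: 0.10}
OLS_MEAN = {-4: -0.13, -3: -0.21, -2: -0.18, -1: 0.07, 0: 0.53,
            1: 0.15, 2: 0.08, 3: 0.05, 4: 0.09}
MEDIAN = {-4: 0.02, -3: -0.13, -2: -0.20, -1: 0.05, 0: 0.19,
          1: 0.07, 2: 0.03, 3: 0.08, 4: -0.01}


def mean_minus_median(means, medians):
    """Gap between a mean estimate and the median, per event quarter."""
    return {q: round(means[q] - medians[q], 2) for q in means}


def most_right_skewed(means, medians):
    """Event quarter with the largest mean-minus-median gap."""
    gaps = mean_minus_median(means, medians)
    return max(gaps, key=gaps.get)


if __name__ == "__main__":
    print("ML :", most_right_skewed(ML_MEAN, MEDIAN))
    print("OLS:", most_right_skewed(OLS_MEAN, MEDIAN))
```

With either mean estimate, the largest gap falls at the swap quarter, which is the ranking the paragraph above describes.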

Fifth, if the Hand-effect is indistinguishable from Ball and Kothari’s general
firm-size effect, then one would presume that there would be reliable firm-size effects
present at both the swap quarter’s earnings announcement and the surrounding
quarters’ earnings announcements (else the general firm-size effect is not so general
after all). Panel A of table 3 reports the coefficient on $PR_{i\tau}$ in a univariate
cross-sectional regression of the two-day excess stock return $A_{i\tau}$ on $PR_{i\tau}$ for each
event-quarter $\tau \in [-4,+4]$.⁵ The results indicate that only at $\tau=0$ and $\tau=+1$ are there any
indications of a nonzero firm-size effect per se. At these quarters the magnitude of the
coefficient on $PR_{i\tau}$ is similar to that found in the market-adjusted returns regressions
reported in Ball and Kothari’s table 4. However, in the surrounding seven other
event-quarters there is no indication of a reliable firm-size effect.

Under the extended functional fixation hypothesis, as applied to the swapping
firms’ experimental design situation, the coefficient on $p_{i0}SGAIN_i$ is predicted to be
positive only at the swap quarter’s earnings announcement, and not at any surrounding
quarterly earnings announcements.⁶ Panel B reports the coefficient on $p_{i\tau}SGAIN_i$,
approximated by assuming that $p_{i\tau}=p_{i0}$ for all $\tau \neq 0$. The results are consistent with the
extended functional fixation hypothesis; only at the swap quarter’s earnings
announcement is the coefficient on $p_{i0}SGAIN_i$ reliably positive, both by cross-sectional t-statistics
and by a time-series t-statistic based on the time-series standard deviation of $d_3$
excluding $\tau=0$ (the latter t-statistic is 3.2). Panel C reports the coefficients on both $PR_{i\tau}$ and
$p_{i0}SGAIN_i$ when both are included in the regressions run in panels A and B. Panel C’s
results indicate that the inferences obtained separately from panels A and B are robust
to any cross-correlation between $PR_{i\tau}$ and $p_{i0}SGAIN_i$.

Taken together, these results indicate that the conclusion reached by Ball and
Kothari that the Hand-effect is indistinguishable from the general firm-size effect they
document at quarterly earnings announcements is premature.


⁴ Classical tests of symmetry (Gupta 1967) yield formal statistics that support this inference.

⁵ These regressions exclude any control for unexpected earnings $UE_{i\tau}$. This omission is innocuous since
under the efficient market hypothesis, $PR_{i\tau}$ must be uncorrelated with unexpected earnings.

⁶ This is because the experimental design aligns the swapping firms at $\tau=0$ by virtue of that being the first
release of their swap gain in an earnings announcement, and the swap gains are almost all positive. At
surrounding earnings announcements of the swapping firms, there will be earnings components that
unsophisticated investors misreact to and that therefore are captured in mispricing of the stock prices, but some such
earnings components may be positive and some may be negative. The expected sign on $p_{i0}SGAIN_i$ at $\tau \neq 0$
should therefore be zero.
Table 3
Cross-Sectional Regressions of the Two-Day Excess Stock Return ($A_{i\tau}$) for
those Quarterly Earnings Announcements $\tau \in [-4, +4]$ Event-Quarters
from the Swap Quarter’s Earnings Announcement ($\tau = 0$)*


Panel A. Maximum Likelihood on: $A_{i\tau} = d_1 + d_2 PR_{i\tau} + \eta_{i\tau} + \varepsilon_\tau$

Quarter τ     -4       -3       -2       -1        0       +1       +2       +3       +4
d_2         -0.014    0.008   -0.009   -0.003    0.024    0.027    0.012    0.005    0.004
(t-stat)    (-1.0)    (0.6)   (-0.6)   (-0.2)    (1.8)    (1.8)     …        …      (-0.3)


Panel B. Maximum Likelihood on: $A_{i\tau} = d_1 + d_3 p_{i0}SGAIN_i + \eta_{i\tau} + \varepsilon_\tau$

Quarter τ     -4       -3       -2       -1        0       +1       +2       +3       +4
d_3         -0.09    -0.18    -0.17     0.18     0.89      …       0.44     0.33    -0.30
(t-stat)    (-0.3)   (-0.6)   (-0.5)    (0.4)    (2.5)   (-0.6)    (1.0)    (0.7)   (-0.7)


Panel C. Maximum Likelihood on: $A_{i\tau} = d_1 + d_2 PR_{i\tau} + d_3 p_{i0}SGAIN_i + \eta_{i\tau} + \varepsilon_\tau$

Quarter τ     -4       -3       -2       -1        0       +1       +2       +3       +4
d_2         -0.015    0.015   -0.007     …       0.010    0.036    0.006    0.001   -0.001
(t-stat)    (-1.0)    (0.9)   (-0.4)   (-0.4)    (0.6)    (2.2)    (0.3)    (0.0)   (-6.0)
d_3          0.08    -0.32    -0.10     0.23     0.78    -0.61     0.37     0.32    -0.30
(t-stat)     (0.2)   (-0.9)   (-0.3)    (0.6)    (2.0)   (-1.4)    (0.8)    (0.6)    (0.6)


* The variables used are Ball and Kothari’s (1991) size-effect proxy ($PR_{i\tau}$), and the interaction between
Hand’s (1990) proxy for the probability that the marginal investor is an unsophisticated investor ($p_{i0}$) and the
swap gain ($SGAIN_i$).

Notes: 1. Two-day common stock prediction errors $A_{i\tau}$ are from a market model, and are calculated over the
          window (-1, 0), where (0) is the day that the earnings announcement appears in The Wall Street
          Journal Index, except for in the swap quarter 0. In the swap quarter 0, day (0) is the day after the
          earnings announcement was reported by the Dow Jones News Retrieval’s //TEXT feature as having
          come over the Broad Tape. If the time-stamp was ≥4 p.m. EST, then day (0) was taken to be two
          days later. Market model parameters are estimated over the period (+2, +301).

       2. $p_{i0}$ is the percentage of firm i’s stock held by noninstitutional holders at the end of the month prior
          to quarter 0’s earnings announcement (Source: Standard & Poor’s Stock Guide).

       3. $PR_{i\tau}$ is a log-linear, scaled transformation of firm size determined just prior to event-quarter τ’s
          earnings announcement.


II. Some Suggestions for Future Research 


In the last ten years a number of puzzling anomalies have been discovered by 
capital markets researchers in accounting and finance, many of which appear to be 
cross-sectionally related to firm-size. What should accounting researchers make of all 
this turmoil, and what research paths appear the most promising? 

At a minimum, firm-size anomalies indicate that relatively little is yet known about 
the factors that determine security prices both cross-sectionally and over time. 
Traditional pricing models have yielded useful approximations, but as more is under- 
stood about the deviations between the predictions of such models and empirical 
reality, it seems likely that researchers will need to be increasingly open to nontradi- 
tional assumptions, models, and test methods. 
As an example, consider the role of unsophisticated versus sophisticated investors
with respect to how quickly, and unbiasedly, accounting information is reflected in
security prices. Relatively little is known about how the interactions of these two types
of investors might affect security prices, both conditionally with respect to accounting
information and unconditionally. Noise-trader models that have recently appeared in
the finance literature may present useful theoretical starting points (DeLong et al. 1989,
1990). Experimental asset markets may provide an additional empirical method by
which the effects of cross-sectional and intertemporal differences in the mix and
abilities of sophisticated and unsophisticated investor populations on security prices can be
feasibly examined. For example, the anomalies documented in the watershed capital
markets research conducted by Bernard and Thomas (1989, 1990) are more severe for
smaller firms. One potential explanation that relates to the extended functional fixation
hypothesis and the study by Ball and Kothari (1991) is that the firm-size correlations are
in fact proxying for the impacts of cross-sectionally varying concentrations of
sophisticated and unsophisticated investors. Although this conjecture can be readily tested
with a quasi-experimental capital markets design, it would be interesting to
cross-validate the results with similar and even more in-depth tests conducted through
experimental asset markets studies.

In summary, I appreciate and have benefited from the competition and rigor
induced by the quality and insight of the Ball and Kothari (1991) study. My hope for
accounting research in the next ten years is that such competition and diversity of
opinion will lead to a broader and deeper understanding of the important accounting
issues facing us.
The Value of Private Pre-Decision
Information in a Principal-Agent
Context

SYNOPSIS AND INTRODUCTION: The information furnished by
management accounting systems aids top management in assessing the
performance of lower levels and in setting proper incentives. These systems
also provide information to lower levels which aids them in making
operational decisions. Typically, detailed information is provided to lower
levels in the organization, while only a summary of this information is
furnished to top management. Therefore, in designing a management
accounting system, a question arises as to the welfare effect of giving an
employee access to detailed information, on which he can base his
decisions, when such (detailed) information cannot be used in evaluating
his performance.¹ More generally, the question arises as to the welfare
effects of increasing the informational asymmetry between upper and lower
management by improving lower management’s private pre-decision
information system. Answering the above question can provide important
insights into the proper design of firms’ management accounting systems.
We examine this issue using the principal-agent framework.

In any given period, information reported in the managerial accounting
system may be pre-decision or post-decision. When the system reports
post-decision information and the contracts are complete, it is clear that

¹ Henceforth, information will be referred to as “public” when it can be used for performance evaluation,
because it is observed by all the contracting parties, and “private” when it cannot be used for performance
evaluation because it is not observed by all the contracting parties. The term “pre-decision” information will be
used to denote information on which individuals can base their decisions. The term “post-decision”
information will denote information that cannot be used for decision making because it arrives after the decision has
been implemented.


This paper has benefited greatly from comments of the participants at the Arden House Workshop of 
Columbia University, April 27-29 1990, especially those of Ram Ramakrishnan, as well as those of two 
anonymous referees and especially those of the Associate Editor. 
the value of such information is non-negative.² The value of pre-decision
information is more difficult to assess. An agent who has access to better
pre-decision information is able to use that information to make better
decisions, given his objectives. However, the agent’s objectives and the
principal’s objectives need not be the same. For example, the agent may
use his better pre-decision information system to more successfully shirk on
the job, making the principal strictly worse off. Thus, the principal is not
necessarily better off by improving the agent’s pre-decision information
system.

One way in which the principal can mitigate any negative effects of
improving the agent’s private pre-decision information system is to require
the agent to communicate the private information. In this paper, we ignore
the possibility of such communication.³ The reason for this is, as noted
earlier, that while large amounts of detailed information are provided to
individuals at lower levels of the firm, only a small amount of that
information is ever communicated to higher levels of the firm. Therefore, we view
ignoring communication as an approximation.

We examine a principal-agent model in which the principal can
influence the extent to which the agent has superior private information on
which the latter can base his action choice. We find sufficient conditions
under which a strict Pareto improvement results from improving the
agent’s private pre-decision information system. This result arises because
improving the agent’s private pre-decision information system leads to
improved coordination between the agent’s information signal and action
choice, which, in turn, results in an increase in the agent’s average
productivity. Although we do not find sufficient conditions under which a
strict negative value might arise, we discuss some possible reasons and
illustrate them with examples.


Key Words: Principal-agent model, Private pre-decision information, De- 
centralization, Management accounting systems. 


The remainder of this paper is organized as follows. The relevant literature is
reviewed in the next section. In section II we describe the basic agency model
that will be studied. Section III contains the results of our analysis. Finally, the
results are summarized and future research is outlined in section IV. The appendix
contains the proofs of the propositions.


I. Literature Review 


Baiman and Evans (1983) studies the value of improving an agent’s private
pre-decision information system but only establishes sufficient conditions under which
there is no welfare effect from doing so. Penno (1984), however, provides a positive
result. His result depends upon the production function being of the form $f(x|a\xi)$
where the action $a \in [0,\bar{a}]$ and the signal $\xi \in [0,\bar{\xi}]$. As a result, for $\xi<\xi_0$ (where $\xi_0 \in (0,\bar{\xi})$),
it is not worthwhile for the principal to have the agent take a positive effort.
Allowing the agent to know whether $\xi<\xi_0$ or $\xi>\xi_0$ thus leads to a reduction in the
agent’s effort when it has no production effect and an increase when it has a positive
production effect. The proof underlying this result, however, does not generalize to
other production settings.

² See … for sufficient conditions under which the value of public post-decision
information is strictly positive.

³ Research dealing with the value of communication includes Baiman and Evans (1983), Christensen
(1981), Dye (1983), Melumad and Reichelstein (1989), Penno (1984), and Sivaramakrishnan (1990).

In contrast, Christensen (1981) and Penno (1989) present examples where providing
the agent with a private pre-decision information system reduces the agency’s welfare.
The reason for Christensen’s result is that in his example: (1) when the agent has no
access to the private pre-decision information system, the output support is movable in
the agent’s action and the first-best solution is obtainable; (2) the information signal
conveys no information about the agent’s marginal productivity and thus the informed
first-best solution is the same as the uninformed first-best solution; and (3) the agent
can use the private signal to choose his action so that the probability of being caught
shirking is zero, thereby making the first-best solution no longer obtainable. Notice that
the second point implies that there is no potential for a positive value of information in
this example. Penno’s example depends upon the solution to the agent’s problem being
a corner point, both with and without the private information.

Thus, the extant examples that demonstrate a negative value to the principal from
endowing the agent with a private pre-decision information system appear to be
somewhat contrived. In particular, the private signal in the Christensen example conveys no
marginal productivity information that would be used by the principal even in a
first-best setting.

Finally, Demski and Sappington (1987) and Lambert (1986) study settings in which
the agent begins the game already endowed with the ability to choose a private
information system. As a result, neither of these studies addresses the welfare effects of the
principal endowing the agent with a private pre-decision information system.

In the following, we examine two different contexts and find sufficient conditions
in each for there to be a welfare increase as a result of improving an agent’s private
pre-decision information system.


II. A General Model


In this section we present a general model with which to study the issue of
improving the agent’s private pre-decision information system. The general model
consists of a single agent, single principal, and single period. The risk-neutral principal
hires the strictly risk- and work-averse agent to supply effort. The firm’s production
function is $p(x|a,y)$, where $x$ is the outcome, $a \in A$ is the agent’s effort, and $y \in Y$ is
some productivity parameter realization that is independent of $a$. Higher action choices
and higher productivity parameter realizations move the distribution of $x$ to the right in
the sense of first-order stochastic dominance. The support of $x$ is an interval on the real
line and is fixed; that is, it is not movable in either $a$ or $y$. The feasible set of actions
from which the agent chooses is an interval on the real line. The agent’s utility function
is separable in wealth and effort, and the disutility of effort is $V(a)=a$.⁴


⁴ This is without loss of generality because, instead of the agent choosing a level of effort $a$ with disutility
$V(a)$, where $V(\cdot)$ is strictly convex, the agent can directly choose a level of disutility $z$ with associated
effort $a=V^{-1}(z)$.
Next we specify the structure of the information system and the way in which we
will measure its informativeness. We represent the density of the productivity
parameter realization ($y$) by $g(y)$ and the signal that the agent receives from the private
pre-decision information system by $\tilde{y}$. We represent the probability distribution over $x$
conditional on the signal received as $q^\delta(x|a,\tilde{y}=y)$, where $\delta$ is a parameter representing the
informativeness of the agent’s private pre-decision information system.

Informativeness is usually operationalized through the idea of randomization. It is
assumed that the signal that is reported is picked from a probability distribution that is
conditional on the realized productivity parameter. We model a more specific situation
in which the agent’s private pre-decision information system may be in or out of
control. When the system is in control, it reports the realized productivity parameter
with a probability of 1. When the information system is out of control, we assume that it
measures and records the productivity parameter realization with error in such a way
that its resulting report is pure noise. In particular, when the system is out of control, it
chooses and reports a signal $\tilde{y}$ with the same probability density as the unconditional
density of the productivity parameter, $g(y)$.⁵ Thus, the signal from the out-of-control
information system conveys no information about the underlying productivity
parameter realization. The probability that the information system is in control ($\delta$) is
independent of the productivity parameter realization and is common knowledge.
Further, we assume that upon receiving the signal $\tilde{y}$, the agent does not know whether the
information system is in or out of control.

The informativeness of the agent’s private pre-decision information system can
then be parameterized by the probability ($\delta$) that the system is in control. Thus, the signal
generated by a $\delta$-fine information system reveals the true productivity realization with
probability $\delta$ and reveals an uninformative signal with probability $(1-\delta)$. Any
information system with $0<\delta<1$ conveys information regarding the true realization of the
productivity parameter with noise. The level of noise strictly decreases in $\delta$.

With the above assumptions, we can restrict our attention to the case in which
$q^\delta(x|a,\tilde{y}=y)=\delta p(x|a,y)+(1-\delta)E_c\,p(x|a,c)$.⁶ This specification of the information
system allows us to address not only whether the principal prefers the agent to be
informed or not ($\delta=1$ versus $\delta=0$), but also whether, given that the agent starts with a
private pre-decision information system with known $\delta$, the principal prefers to increase or
decrease that $\delta$. By allowing the agent to start with a private pre-decision information
system with known $\delta$, we allow for the possibility that the agent has expertise in
gathering information. Note that we are allowing the agent to enter the contract with a private
pre-decision information system, not with private pre-decision information.
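
To make the mixture form of the agent’s posterior concrete, the sketch below works through a hypothetical two-point example. The prior `PRIOR`, the success probabilities `P_SUCCESS` (for a binary outcome at one fixed action), and the function names are all assumptions introduced for illustration; they are not part of the model above. The second function derives the same posterior from Bayes’ rule over the in-control and out-of-control events, which is one way to see where the mixture comes from in a discrete setting.

```python
from fractions import Fraction as F

# Hypothetical two-point example: productivity y is "low" or "high".
PRIOR = {"low": F(1, 2), "high": F(1, 2)}       # assumed g(y)
# Assumed p(x = 1 | a, y) at one fixed action a.
P_SUCCESS = {"low": F(1, 5), "high": F(4, 5)}


def q_mixture(signal, delta):
    """q^delta(x=1 | a, signal) = delta*p(x|a,y) + (1-delta)*E_c p(x|a,c)."""
    prior_mean = sum(PRIOR[c] * P_SUCCESS[c] for c in PRIOR)
    return delta * P_SUCCESS[signal] + (1 - delta) * prior_mean


def q_bayes(signal, delta):
    """The same probability, derived from Bayes' rule over the true productivity c."""
    # Joint probability of (true c, reported signal): with probability delta the
    # system is in control and reports c; otherwise it reports an independent
    # draw from the prior.
    joint = {c: PRIOR[c] * (delta * (c == signal) + (1 - delta) * PRIOR[signal])
             for c in PRIOR}
    total = sum(joint.values())
    return sum(joint[c] / total * P_SUCCESS[c] for c in PRIOR)


if __name__ == "__main__":
    for delta in (F(0), F(1, 3), F(1)):
        print(delta, q_mixture("high", delta), q_bayes("high", delta))
```

Exact rational arithmetic is used so that the two derivations can be compared for equality rather than approximate closeness.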

Although our particular representation of $q^\delta(x|a,\tilde{y}=y)$ entails some loss of
generality in comparing two information systems with $\delta_1$ and $\delta_2$ where $\delta_1,\delta_2 \in (0,1)$, it is
without loss of generality for comparisons between the perfectly informed case ($\delta=1$)
and the uninformed case ($\delta=0$). We assume that the principal can choose $\delta$ and are
interested in the welfare effects of that choice.

The sequence of events in the model is as follows. At the beginning of the period,
the principal and the agent enter into a binding contract. The agent enters with a
private pre-decision information system of known $\delta$ ($0 \le \delta \le 1$), which the principal may
increase (if $\delta<1$) or decrease (if $\delta>0$), as agreed to in the contract. At the time of
contracting, both the principal and the agent share the same beliefs about the production
function and the agent’s private pre-decision information system. The agent observes
the signal realization from the private pre-decision information system and then takes a
productive action $a$. At the end of the period, the outcome is realized and publicly
observed, and the agent is compensated according to the contract.


⁵ Obviously, there are many possible ways in which the signal may be generated when the system is not in
control. Our assumption is reasonable and enhances the tractability of the model.

⁶ When referring to the signal received, we will use the notation $\tilde{y}=y$ only with the agent’s posterior belief
function, $q^\delta(x|a,\tilde{y}=y)$. The function $p(x|a,y)$ represents the probability of outcome $x$ given that action $a$ is chosen
and that the true productivity is $y$.


Baiman and Sivaramakrishnan—The Value of Private Pre-Decision Information


The principal’s problem corresponding to the case in which the agent has access to
a $\delta$-fine private pre-decision information system is given by Program I, below.


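Program I


$$\max_{s(\cdot),\,a(\cdot)} \int_Y \int_X [x - s(a(y),x)]\,[\delta p(x|a(y),y) + (1-\delta)E_c(p(x|a(y),c))]\,dx\,g(y)\,dy$$

subject to:

$$\int_Y \Big\{ \int_X U(s(a(y),x))\,[\delta p(x|a(y),y) + (1-\delta)E_c(p(x|a(y),c))]\,dx - a(y) \Big\}\,g(y)\,dy \ge \bar{U} \tag{I.1}$$

$$\int_X U(s(a(y),x))\,[\delta p(x|a(y),y) + (1-\delta)E_c(p(x|a(y),c))]\,dx - a(y) \ge \int_X U(s(a',x))\,[\delta p(x|a',y) + (1-\delta)E_c(p(x|a',c))]\,dx - a' \quad \forall y \text{ and } \forall a' \in A \tag{I.2}$$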
X , S 7 3 


Notice that we allow for contracting on the agent’s action. Both this case and the
case in which the agent’s action cannot be contracted on will be studied in the next
section. The first constraint (I.1) is the agent’s individual rationality constraint; the
second constraint (I.2) is the agent’s incentive compatibility constraint. When the
agent’s action can be contracted on, we do not require that the agent’s action choice be
represented by the first-order conditions. When the agent’s action cannot be contracted
on, the first-order conditions are a valid representation of the agent’s problem, given
additional assumptions to be made later.⁷ Because we are interested in the value of
private pre-decision information rather than the value of communication, Program I is
based on the assumption that the agent cannot communicate the signal realization to
the principal. Further, once the contract is signed, the agent cannot quit, regardless of
the information received.


Case 1—Observable Action

In the first case, we examine a situation in which the contract can be written on
both the outcome $x$ and the agent’s action choice. The results of this section are
contained in the following observation and proposition.


⁷ See Jewitt (1988) and Rogerson (1985) for discussions of when the first-order approach is valid. Also see
footnote 10.
Observation. For each $\delta$, the first-best solution, conditional on that $\delta$, is achievable.


Proposition 1. Given Assumptions (1.a) and (1.b) below: (1) any $\delta$-fine private
pre-decision information system ($\delta>0$) is strictly Pareto-preferred to the minimally
fine private pre-decision information system ($\delta=0$); (2) starting at any $0<\delta<1$,
increasing $\delta$ leads to a strict Pareto improvement; and (3) the maximally fine
private pre-decision information system ($\delta=1$) is strictly Pareto-preferred to the
minimally fine private pre-decision information system ($\delta=0$).⁸


Assumptions.


(1.a) The principal observes both $x$ and $a$.
(1.b) $\forall a \in A$, $\exists$ a set of positive measure $Y' \subset Y$ such that $\forall y \in Y'$

$$\int_X x\,[p_a(x|a,y) - E_c(p_a(x|a,c))]\,dx > 0.$$


The idea behind the Observation is quite straightforward and no proof is given.
Because the principal observes the agent’s action (although not the action rule), the
principal can make the agent indifferent across all action choices, and thereby induce the
agent to implement the first-best action rule for any $\delta$-fine system. The principal can do
this by paying the agent $s(a(y))=U^{-1}(a(y)+\bar{U})$.

The intuition behind Proposition 1 is also straightforward. Assumption (1.b) states
that the private pre-decision signal has information about the marginal productivity of
the agent. The principal would like the agent to choose high effort when marginal
productivity would be high and choose low effort when marginal productivity would be
low. Given the Observation, the principal can induce the agent to implement the
first-best decision rule, which, because of Assumption (1.b), varies in $y$.

Part three of Proposition 1 clearly follows from parts one and two. The proof of part
one follows by construction directly from Assumptions (1.a) and (1.b). In order to prove
part two, it is necessary to establish the following condition:


$$\int_X x \int_Y [p(x|a^*_\delta(y),y) - E_c(p(x|a^*_\delta(y),c))]\,g(y)\,dy\,dx > 0, \tag{1}$$


where $a^*_\delta(y)$ is the first-best action given a $\delta$-fine private pre-decision information
system, and the left hand side of condition (1) is the derivative with respect to $\delta$ of the
principal’s expected utility. Assumption (1.b) is sufficient to establish that condition (1)
holds, which, in turn, is sufficient to establish part two of Proposition 1.

Condition (1) can be interpreted as the difference in the expected output or
productivity when the agent is informed about the true productivity parameter versus
uninformed, with his action rule, $a^*_\delta(y)$, held fixed. This difference reflects the way in which
$a^*_\delta(y)$ coordinates between action and signal. For example, if $a^*_\delta(y)$ is constant, this
difference is zero.


8 The proposition could be stated more simply as, “Starting at any $0 < \delta < 1$, increasing $\delta$ results in a strict
Pareto improvement.” The reason for stating the proposition in multiple parts is twofold. First, it emphasizes
the role of condition (1), which will also be important in the proof of Proposition 2. Second, when the agent’s
action choice is not contractible, Proposition 2 must be stated in multiple parts because the sufficient
conditions for the two parts are different.

Any increase in $\delta$ will, by definition, allow the agent to better coordinate his action
choices with the true productivity parameter realization, given his own induced
preferences. However, whether the principal benefits from such improved coordination
depends on the agent’s decision rule, which in turn depends upon the control that the
principal exercises over the agent’s incentives. If the principal exercises little control
over the agent’s incentives, then the agent’s and the principal’s interests will be quite
divergent. In this case, allowing the agent to better coordinate his action with the true
productivity parameter by increasing $\delta$ may not benefit the principal. This is illustrated
in Christensen’s negative-value example.

When the agent’s action choice is contractible and, hence, the first-best action rule
can be induced, there is no divergence between the agent’s and the principal’s
incentives and therefore condition (1) follows from Assumption (1.b). In the next case, the
agent’s action choice is not contractible, and we will require a condition similar to
condition (1) to hold in order to establish an analog for the second part of Proposition 1.
However, the sufficient conditions to assure that the analog of condition (1) holds will
be more stringent.


Case 2—Unobservable Action 


When action is unobservable, as Christensen’s example illustrates, it is no longer
clear whether the agency will be better off by improving the agent’s private
pre-decision information system. To examine this case, we place additional structure on
the production function by assuming that it satisfies the Linear Density Function
Condition (LDFC).⁹ In particular, we assume that $p(x|a,y) = h(a,y) f_H(x) + (1 - h(a,y)) f_L(x)$,
where $f_L(x)$ and $f_H(x)$ are density functions; $f_H(x)$ is preferred to $f_L(x)$ in the sense of
first-order stochastic dominance, and $h(a,y)$ is a real-valued function that maps the
agent’s action choice and the true productivity parameter realization into the interval
[0,1].¹⁰ This production function is descriptive of a situation in which the agent’s action
affects the probability $h(a,y)$ of whether the machine on which he works is in control
(i.e., x is drawn from $f_H(x)$) or out of control (i.e., x is drawn from $f_L(x)$), but does not
affect the distribution of x given the machine’s control state.
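
As a rough illustration of the LDFC structure just described, the following Python sketch draws an outcome from the in-control density with probability h(a, y) and from the out-of-control density otherwise. The two normal densities, the forms k(a) = sqrt(a)/2 and l(y) = y, and all function names are assumptions made for this sketch only; none of them is taken from the model in the text.

```python
import random


def h(a, y):
    """Hypothetical in-control probability h(a, y) = k(a) * l(y).

    Assumed forms for illustration only: k(a) = sqrt(a) / 2 and l(y) = y,
    with a in [0, 4] and y in [0, 1] so that h stays inside [0, 1].
    """
    return (a ** 0.5 / 2.0) * y


def draw_outcome(a, y, rng):
    """Draw x from the mixture h * f_H + (1 - h) * f_L.

    f_H is assumed to be Normal(2, 1) and f_L to be Normal(0, 1); a higher
    mean with the same variance makes f_H first-order stochastically dominant.
    """
    in_control = rng.random() < h(a, y)
    return rng.gauss(2.0, 1.0) if in_control else rng.gauss(0.0, 1.0)


def expected_outcome(a, y):
    """Mean of the mixture under the assumed densities (means 2 and 0)."""
    return h(a, y) * 2.0 + (1.0 - h(a, y)) * 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [draw_outcome(1.0, 0.5, rng) for _ in range(20000)]
    print(sum(draws) / len(draws), expected_outcome(1.0, 0.5))
```

In this sketch the action changes only the mixing weight, never the two component densities, which is the property the text attributes to LDFC.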

With the above structure, Propositions 2 and 3 present conditions under which
enhancing the informativeness (increasing $\delta$) of the agent’s private pre-decision
information system is in the best interest of the agency.


Proposition 2. (1) Given Assumptions (2.a)-(2.c) and (2.d′) below, any $\delta$-fine private
pre-decision information system ($\delta > 0$) is strictly Pareto-preferred to the
minimally fine private pre-decision information system ($\delta = 0$); and (2) given
Assumptions (2.a)-(2.c) and (2.d) below, starting at any $0 < \delta < 1$, increasing the


9 See Hart and Holmstrom (1987) for a further discussion of the properties of such production functions.

10 With LDFC and h(a,y) concave in a, the first-order approach is valid. In particular, given the
reformulation of the problem discussed in the proof of Proposition 2 (see the appendix), the agent’s expected
utility in the uninformed case is:


$$\int_W \int_X U(s(x))[wz f_H(x) + (1 - wz) f_L(x)]\,g(w)\,dx\,dw - k^{-1}(z).$$


In this problem, the agent’s decision variable is z. The second derivative of the agent’s expected utility with
respect to z is $-{k^{-1}}''(z)$ which, by assumption, is negative. Therefore the agent’s expected utility is concave in
z, and the first-order approach is valid. The same analysis holds for the informed case.
fineness of the agent’s private pre-decision information (increasing $\delta$) leads to a
strict Pareto improvement.

Assumptions.


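(2.a) The agent has a $\delta$-fine pre-decision information system with a known $\delta$.
(2.b) $p(x|a,y) = h(a,y) f_H(x) + (1 - h(a,y)) f_L(x)$, where $h(a,y) = k(a)\,l(y)$.
(2.c) The functions $k(a)$ and $l(y)$ satisfy the following properties:


$k(a) \geq 0 \quad \forall a$
$k'(a) > 0 \quad \forall a$
$k''(a) < 0 \quad \forall a$
$l'(y) > 0 \quad \forall y$
$l(y) \geq 0 \quad \forall y$


(2.d) $\dfrac{d}{dz}\left[\dfrac{{k^{-1}}''(z)}{{k^{-1}}'(z)}\right] < 0 \quad \forall z$, where $k^{-1}(\cdot)$ is the inverse of $k(\cdot)$.¹¹
(2.d′) ${k^{-1}}'(z)$ is concave in z.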


Given the similarity of sufficient conditions underlying the two parts of Proposition 2,
we will primarily discuss the second part of Proposition 2. The proof of the
second part of Proposition 2, like that of the second part of Proposition 1, depends upon
establishing that condition (1) holds, but with $a_\delta(y)$ being the optimal second-best action
rule for a given $\delta$. Given the additional structure assumed for the production function,
condition (1) can be simplified to:


$$E_y[h(a_\delta(y), y) - E_c h(a_\delta(y), c)] > 0. \qquad (3)$$


But the intuition is the same. Because of moral hazard, the principal is now unable
to perfectly align the agent’s interests with his own. As a result, it is less clear than in
Case 1 that the principal would prefer the agent to better coordinate (relative to the
agent’s preferences) his action choice with the productivity parameter realization. That
is, with moral hazard, it is less clear that allowing the agent to better coordinate his
action choice with the productivity parameter realization will result in higher average
productivity (condition (3)).

We next discuss the role of the assumptions in establishing Proposition 2. The
assumption $k'(a) > 0$ implies that higher effort levels lead to more favorable outcomes;
$k''(a) < 0$ implies decreasing marginal returns to effort. These two assumptions are typical
characteristics of production functions. Similarly, the assumptions on $l(y)$ merely
order the signals, with higher realizations of y implying higher likelihoods of more
favorable outcomes. Taken together, these assumptions imply that $h_{ay} > 0$.¹²

Assumptions (2.d) and (2.d′) are used to order the effect, on the agent’s average
induced action choice, of increasing the informativeness of his information system.
Assumption (2.d′) is sufficient to establish that the agent’s average action choice is strictly
greater in the informed case ($\delta > 0$) than in the uninformed case ($\delta = 0$). Assumption
(2.d), although formidable looking, can be given a natural interpretation in terms of the
curvature of the agent’s disutility function. To see this, notice that the agent’s choice of


11 This condition holds, for example, for $k(a) = a^n$ for $n < 1$.
12 Subscripts denote partial derivatives.

productive action a can be equivalently formulated as choosing $z = k(a)$. In this
reformulation $k^{-1}(z)$ represents the agent’s disutility in choosing z. When the agent has a
non-null information system, his action choice will vary in the signal reported. Therefore,
the agent is bearing risk in terms of his uncertain action choice. Because $k^{-1}(z(\cdot))$ is
convex and hence $-k^{-1}(z(\cdot))$ is concave, the agent will have to be paid a premium as
compensation for the risk involved in having an, ex ante, uncertain action choice.
Therefore


$$\frac{{k^{-1}}''(z)}{{k^{-1}}'(z)}$$


represents the agent’s absolute work aversion, and Assumption (2.d) states that the
agent’s disutility for effort exhibits decreasing absolute effort aversion.¹³ Decreasing
absolute effort aversion means that the higher the average effort induced, the smaller
the risk premium per unit of risk that must be paid to the agent. Assumption (2.d) is
used to establish that the agent’s average productivity increases with increased
informativeness of the information system.
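
A small numerical sketch may help fix ideas. It assumes the example production function k(a) = a^n with n < 1 mentioned in the footnote, so that the disutility of choosing z is k^{-1}(z) = z^{1/n}, and it takes the effort-aversion measure to be the ratio of the second to the first derivative of k^{-1} (the disutility analog of the Pratt-Arrow measure). That reading of the measure and the function names are assumptions of this sketch.

```python
def disutility_derivatives(z, n):
    """First and second derivatives of k^{-1}(z) = z**(1/n) for k(a) = a**n."""
    m = 1.0 / n
    first = m * z ** (m - 1.0)
    second = m * (m - 1.0) * z ** (m - 2.0)
    return first, second


def effort_aversion(z, n):
    """Assumed measure (k^{-1})''(z) / (k^{-1})'(z), which equals (1/n - 1) / z."""
    first, second = disutility_derivatives(z, n)
    return second / first


if __name__ == "__main__":
    for z in (1.0, 2.0, 4.0):
        print(z, effort_aversion(z, 0.5))
```

Under these assumptions the measure is positive and falls as z rises for any n < 1, which is the decreasing pattern that Assumption (2.d) requires.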

Assumption (2.d′) is a special case of Assumption (2.d) in that (2.d′) implies (2.d),
but (2.d) does not imply (2.d′). Thus the conditions underlying the first part of
Proposition 2 are sufficient to assure the second part and hence to assure that the maximally
fine private pre-decision information system ($\delta = 1$) is strictly Pareto-preferred to the
minimally fine private pre-decision information system ($\delta = 0$). The first part of
Proposition 2 needs a stronger assumption than the second because its proof relies on a
comparison between a feasible but non-optimal solution to the $\delta > 0$ case and the optimal
solution to the $\delta = 0$ case, while the proof of the second part relies on a comparison of
optimal solutions as $\delta$ is varied.

The assumptions which we have discussed so far do not appear to be overly
restrictive and do not appear to be driving the result. The important assumptions appear to
be LDFC, the multiplicative separability of $h(a(y), y)$, and the implication that $h_{ay} > 0$.
As Proposition 3 below shows, a strict welfare improvement can accrue to the agency
even when higher signals decrease the marginal productivity of the agent’s action
($h_{ay} < 0$).


Proposition 3. Given the assumptions below, starting at any $0 < \delta < 1$, increasing the
fineness of the agent’s private pre-decision information (increasing $\delta$) leads to
a strict Pareto improvement.¹⁴


Assumptions.
(3.a) The agent has a $\delta$-fine pre-decision information system with a known $\delta$.


(3.b) $p(x|a,y) = h(a,y) f_H(x) + (1 - h(a,y)) f_L(x)$, where $h(a,y) = 1 - e^{-(a+y)}$.
(3.c) $y > 0$.


Only the LDFC and the separability assumptions remain to be discussed. LDFC and
the multiplicative separability of $h(a,y)$ play a crucial role in establishing Propositions


13 This is the disutility function analog of the Pratt-Arrow measure of absolute risk aversion for utility
functions.


756 The Accounting Review, October 1991


2 and 3. With LDFC, we can express the welfare effect of a change in $\delta$ as:


$$\frac{dL}{d\delta} = \Big\{\int_X (x - s(x))[f_H(x) - f_L(x)]\,dx\Big\}\Big[E_y\big(h(a(y),y) - E_c h(a(y),c)\big)\Big]$$

$$\qquad + \lambda\Big\{\int_X U(s(x))[f_H(x) - f_L(x)]\,dx\Big\}\Big[E_y\big(h(a(y),y) - E_c h(a(y),c)\big)\Big]$$

$$\qquad + \Big\{\int_X U(s(x))[f_H(x) - f_L(x)]\,dx\Big\}\Big[E_y\,\mu(y)\big(h_a(a(y),y) - E_c h_a(a(y),c)\big)\Big].$$


LDFC allows us to separate $dL/d\delta$ into three additive terms, each of which is the
product of two terms. The terms in the { } brackets represent the effect on the agency’s
welfare of a change in productivity. This effect depends directly only on the exogenously
given distributions of x ($f_L(x)$ and $f_H(x)$), and depends upon $h(\cdot,\cdot)$ and the distribution
of y only indirectly through the optimal compensation function s(x). The terms in the
large [ ] brackets represent the effect on the agent’s productivity of a change in $\delta$. Thus,
LDFC allows us to sign $dL/d\delta$ by signing each of these effects individually. The sign of
the terms in each of the { } brackets is positive.

The multiplicative separability of $h(\cdot,\cdot)$ is then used to establish that the sign
of the terms in each of the large [ ] brackets is positive. Notice that the terms in the large
[ ] brackets are condition (3) or the partial derivative with respect to a of a variant of
condition (3). To establish the sign of these terms, notice that with multiplicative
separability, $E_y[h(a(y),y) - E_c h(a(y),c)]$ can be rewritten as $E_y[k(a(y))\{l(y) - E_c l(c)\}]$.
The other assumptions are sufficient to assure that $k(a(y))$, $\mu(y)$, and $l(y)$ are positively
correlated, which establishes that the signs of the terms in the large [ ] brackets are
strictly positive. This then establishes Proposition 2.

The importance of the separability assumption is further demonstrated by the
example summarized in panel A of exhibit 1. The example shows that even with LDFC,
in the absence of separability, a strict welfare loss can result from improving the agent’s
private pre-decision information system. In this example, the agent chooses an action
$a \in A$. If the action exceeds a random state realization, $\alpha$, the firm’s outcome is success;
otherwise, the outcome is failure. The random state occurrence is uniformly distributed
either over the interval [1, 3] or [0, 4]. If the agent is given the private pre-decision
information system, the signal informs him of the support of the random state occurrence.
Note that the production function in panel A of exhibit 1 conforms to the LDFC
specification, with $f_L(x)$ and $f_H(x)$ both degenerate. However, $h(a,y)$ is not separable in a
and y.

Exhibit 1
An Example Demonstrating the Importance of Separability

Panel A. The Example:

$$\max\; [x_H - s_H]\,P(\alpha < a) + [x_L - s_L]\,P(\alpha > a)$$
subject to
$$U(s_H)P(\alpha < a) + U(s_L)P(\alpha > a) - V(a) \geq \bar{U},$$
$$a^* \in \arg\max_{a \in A}\,\{U(s_H)P(\alpha < a) + U(s_L)P(\alpha > a) - V(a)\}.$$

Agent’s utility function = $\sqrt{w} - a^2$;
Agent’s minimum expected utility = 15;
Outcome set = $\{x_H, x_L\}$ = {500, 300};
Actions set = A = $1 \leq a \leq 3$;
Set of signals = $\{y_1, y_2\}$;
Prior beliefs over signals = {0.8, 0.2};
Relationship between signals and outcomes:
  $y_1 \Rightarrow \alpha$ uniformly distributed over [1, 3]; and
  $y_2 \Rightarrow \alpha$ uniformly distributed over [0, 4].

Panel B. Optimal Solutions:

First-best informed
  Optimal action rule                {1.48, 1}
  Principal’s expected utility       61.01
  Agent’s expected utility           15

Second-best uninformed
  Optimal action                     {1.162029}
  Principal’s expected utility       [illegible]
  Agent’s expected utility           [illegible]

Second-best when the y realization is publicly observed
  Optimal action rule                {1.29, 1}
  Principal’s expected utility       54.87
  Agent’s expected utility           [illegible]

Second-best when the agent privately observes the y realization
  Optimal action rule                {2, 1}
  Principal’s expected utility       35.594383
  Agent’s expected utility           15


The solution to this example is presented in panel B of exhibit 1. The solution
indicates that, while the principal prefers the public pre-decision information case to the
no-information case, the principal prefers the no-information case to the private
information case. Thus, this example illustrates the negative welfare effect of improving the
agent’s private pre-decision information system. Notice that in this case, when the
agent privately observes the signal, his action choices are at least as great as in the
first-best case.¹⁵


15 In this example,
$$\text{prob}(x = x_H | y_1) = \frac{a-1}{2}, \qquad \text{prob}(x = x_H | y_2) = \frac{a}{4},$$
$$\partial\,\text{prob}(x = x_H | y_1)/\partial a = .5, \qquad \partial\,\text{prob}(x = x_H | y_2)/\partial a = .25.$$


The realization of y affects both the probability of the high outcome and the marginal 
productivity of effort (unlike in the Christensen example). The marginal productivity of 
effort is decreasing in y (as assumed in Proposition 3). In addition, the signal y shifts the
outcome distribution to the right for a<2 and to the left for a>2. Thus, one difference 
between Propositions 2 and 3 and this example is that, in the example, the information 
signal does not unambiguously move the outcome distribution to the right in the sense 
of first-order stochastic dominance. This ability of the agent’s action choice to change 
the way in which the signal shifts the outcome distribution cannot arise with a separa- 
ble production function, but obviously can arise with a non-separable production func- 
tion. 
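
The crossing at a = 2 can be checked directly from the example’s primitives. The sketch below assumes, as the example describes, that success occurs when the action exceeds a state that is uniform on [1, 3] under the first signal and on [0, 4] under the second; the function names are hypothetical.

```python
def success_prob(a, low, high):
    """P(state < a) when the state is uniform on [low, high]."""
    return min(1.0, max(0.0, (a - low) / (high - low)))


def h_signal_1(a):
    """Success probability under y1 (state uniform on [1, 3])."""
    return success_prob(a, 1.0, 3.0)


def h_signal_2(a):
    """Success probability under y2 (state uniform on [0, 4])."""
    return success_prob(a, 0.0, 4.0)


if __name__ == "__main__":
    for a in (1.5, 2.0, 2.5):
        # The ratio of the two probabilities changes with a, so h(a, y)
        # cannot be written as a product k(a) * l(y).
        print(a, h_signal_1(a), h_signal_2(a), h_signal_1(a) / h_signal_2(a))
```

For actions below 2 the second signal gives the higher success probability, and for actions above 2 the ordering reverses, which is the action-dependent shift the paragraph above describes.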

The above difference suggests one possible reason for the positive results of
Propositions 2 and 3 and the negative result of the example. Relative to a problem in which
the signal shifts the outcome distribution in the same direction regardless of the agent’s
action choice, when the agent’s action choice can change the way the signal shifts the
outcome distribution, the agent’s choice of action becomes more subtle and, it would
seem, more difficult to control. This would seem to imply that the principal’s control
over the agent is reduced when the production function is non-separable and the
agent’s action choice affects how the signal shifts the outcome distribution. As
discussed earlier, with reduced control over the agent, the principal is more likely to be
made worse off by improving the agent’s private pre-decision information system.


V. Discussion and Summary


In this article we present conditions under which costlessly improving the agent’s
private pre-decision information system results in a strict Pareto improvement. The
sufficient conditions are such that an improvement in the agent’s private pre-decision
information system leads to improved coordination of the agent’s action choice and the
realized productivity parameter and to higher average productivity. Notice that in the
cases studied, the principal’s and agent’s preferences over the information system
choices are unanimous. As a result, the same orderings would hold if it were the agent
rather than the principal who had the right to choose between the costless information
systems and that choice was not observed by the principal.

We also looked at how sensitive our positive results were with respect to the
underlying assumptions. In particular, we found that in the moral hazard case, the positive
result could be overturned by using a non-separable production function versus the
assumed separable production function. A negative value to improving the agent’s
private pre-decision information system might arise in other situations. One setting in
which a welfare loss could occur is when the labor market is such that granting the
agent access to a pre-decision information system allows the agent to quit the firm after
observing that information. Granting the agent access to the pre-decision information
system allows the agent to extract informational rents from the principal. These
information rents may be greater than any gross benefit to the principal from the agent’s
being able to take more appropriate actions.

A negative value could also result when the agent’s action choice is discrete. In an
extreme situation, even if the private information signal is informative about the agent’s
marginal productivity, the feasible set of action choices may be such that the optimal
informed decision rule is the same as the optimal uninformed action choice. However,
endowing the agent with a private pre-decision information system increases the
principal’s control problem because it increases the number of incentive compatibility
constraints in the problem. Thus, the principal is inducing the same behavior from the
agent in the privately informed case as in the uninformed case, but it is more expensive
to do so. A less extreme example of this point is illustrated by the example in panel A of
exhibit 2 and its solution in panel B.

Notice that in this example, the private signal has information about the marginal
productivity of the action taken, the production function has a nonmovable support,
and the production function satisfies first-order stochastic dominance with respect to
both the signal and the agent’s action choice.

In the first-best case, the principal is made better off when the information signal is
publicly revealed. However, in the second-best case, the principal strictly prefers to
Exhibit 2
An Example of Negative Value with Discrete Action Choice


Panel A. The Example:

Principal’s utility function G(x - w) = x - w;
Agent’s utility function = $\sqrt{w} - a^2$;
Agent’s minimum expected utility = 250;
Outcome set = {50,000; 100,000; 300,000};
Actions set = {1, 3, 5};
Set of signals = $\{y_1, y_2, y_3\}$;
Prior beliefs over signals = [illegible]; and
Relationship between actions, signals, and outcomes:

              y1                   y2                   y3
        x1    x2    x3       x1    x2    x3       x1    x2    x3
  a1   0.70  0.20  0.10     0.50  0.30  0.20     0.45  0.30  0.25
  a2   0.65  0.22  0.13     0.45  0.25  0.30     0.40  0.25  0.35
  a3   0.60  0.25  0.15     0.40  0.20  0.40     0.10  0.20  0.70


Panel B. Optimal Solutions:

First-best informed
  Optimal action rule                {a2, a3, a3}
  Principal’s expected utility       90,974
  Agent’s expected utility           250

Second-best uninformed
  Optimal action                     a3
  Principal’s expected utility       87,990
  Agent’s expected utility           250

Second-best with the y realization publicly observed
  Optimal action rule                {a1, a3, a3}
  Principal’s expected utility       87,362
  Agent’s expected utility           250

Second-best when the agent privately observes the y realization
  Optimal action rule                {a1, a2, a3}
  Principal’s expected utility       83,495
  Agent’s expected utility           250


keep the agent uninformed rather than install a pre-decision information system,
whether private or public.¹⁶ The cost of inducing the agent to take the appropriate
informed actions (whether based on a private or public information system) exceeds the
benefit from doing so. The reason for this appears to be the discreteness of the action
set. This inference is supported by the fact that, although the signals are informative
with respect to the agent’s marginal productivity, the optimal action rules for the
second-best public information system case, the first-best informed case, and the
optimal action in the uninformed case differ only for the low-signal outcome.
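
The first-best action rule for this example can be recomputed from the panel A data. The sketch below assumes that actions a1, a2, a3 correspond to effort levels 1, 3, 5, that the first-best wage is the one holding the agent at his minimum expected utility of 250, and that the a2 row under y1 reads 0.65, 0.22, 0.13 so that it sums to one. Prior beliefs are not needed for the action rule, so none are assumed; all identifiers are hypothetical.

```python
OUTCOMES = (50_000, 100_000, 300_000)
EFFORT = {"a1": 1, "a2": 3, "a3": 5}  # assumed mapping of actions to effort
MIN_UTILITY = 250

# Outcome probabilities (x1, x2, x3) by signal and action, as read from panel A.
PROBS = {
    "y1": {"a1": (0.70, 0.20, 0.10), "a2": (0.65, 0.22, 0.13), "a3": (0.60, 0.25, 0.15)},
    "y2": {"a1": (0.50, 0.30, 0.20), "a2": (0.45, 0.25, 0.30), "a3": (0.40, 0.20, 0.40)},
    "y3": {"a1": (0.45, 0.30, 0.25), "a2": (0.40, 0.25, 0.35), "a3": (0.10, 0.20, 0.70)},
}


def first_best_wage(effort):
    """Flat wage solving sqrt(w) - effort**2 = MIN_UTILITY."""
    return (MIN_UTILITY + effort ** 2) ** 2


def principal_payoff(signal, action):
    """Expected outcome net of the first-best wage for one signal and action."""
    expected_x = sum(p * x for p, x in zip(PROBS[signal][action], OUTCOMES))
    return expected_x - first_best_wage(EFFORT[action])


def first_best_rule():
    """Best action for each signal when effort is contractible."""
    return {s: max(EFFORT, key=lambda act: principal_payoff(s, act)) for s in PROBS}


if __name__ == "__main__":
    print(first_best_rule())
```

Under these assumptions the computed rule selects the middle action for the low signal and the highest action for the other two signals.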

This present work can be naturally extended in a number of directions. First, there 
is a need to establish similar sufficient conditions for more general settings. Second, it 
would be interesting to see how the sufficient conditions can be weakened when com- 
munication is allowed. Third, and perhaps of most interest, it would be useful to find 
sufficient conditions under which improving the agent’s private pre-decision informa- 
tion system leads to a reduction in the agency’s welfare. 


16 Note that when the pre-decision information system generates a public signal, the principal can contract
on this signal as well as the outcome.

Appendix A
Proofs of Propositions


Proof of Proposition 1
Part 1.
The first-best uninformed arrangement is the solution to the following problem:

$$\max_a \int_X x\,E_c p(x|a,c)\,dx - U^{-1}(a + \bar{U}).$$

The first-best action $a^*$ satisfies the following first-order condition:

$$\int_X x\,E_c p_a(x|a^*,c)\,dx - {U^{-1}}'(a^* + \bar{U}) = 0. \qquad (1.1)$$


A strict Pareto improvement can be established by giving the agent a $\delta$-fine ($\delta > 0$) private pre-decision
information system in the following way. Direct the agent to implement the following action rule. Let:


$a(y) = a^*$ for all $y \notin Y'$,
$a(y) = a^* + \varepsilon$ for $y \in Y'$.

In turn, let the compensation rule be:

$s(a) = U^{-1}(\bar{U} + a^*)$ for action $= a^*$,
$s(a) = U^{-1}(a^* + \varepsilon + \bar{U})$ for action $= a^* + \varepsilon$.


Clearly, the agent is indifferent between the two acceptable actions no matter which signal he observes.
Therefore, the agent will be willing to implement the above action rule, given the above compensation rule.
Further, under the above decision rules, the agent is indifferent between being uninformed and being
privately informed.

The only point remaining to check is whether the above feasible action and compensation rules leave the
principal strictly better off than with an uninformed agent. The principal’s expected utility under the new
system is:


$$\int_{y \notin Y'} \Big[\int_X x\,q^\delta(x|a^*,y)\,dx - U^{-1}(a^* + \bar{U})\Big] g(y)\,dy
+ \int_{y \in Y'} \Big[\int_X x\,q^\delta(x|a^*+\varepsilon,y)\,dx - U^{-1}(a^* + \varepsilon + \bar{U})\Big] g(y)\,dy.$$


Notice that the principal’s expected utility is the same as in the uninformed case when $\varepsilon = 0$. Taking the
derivative of the above with respect to $\varepsilon$, substituting condition (1.1), and using the definition of $q^\delta(x|a,y)$
gives:


$$\frac{\partial EU^\delta}{\partial \varepsilon}\bigg|_{\varepsilon = 0} = \delta \int_{y \in Y'} \int_X x\,\{p_a(x|a^*,y) - E_c p_a(x|a^*,c)\}\,dx\,g(y)\,dy.$$


Given Assumption (1.b), choosing $\varepsilon > 0$ results in a strict Pareto improvement.

Q.E.D.


Part 2.
The principal’s expected utility when the agent has access to a $\delta$-fine information system and the agent’s
action is observable is given by:


$$EU^\delta = \int_Y \Big\{\int_X x\,[\delta(p(x|a_\delta(y),y) - E_c p(x|a_\delta(y),c)) + E_c(p(x|a_\delta(y),c))]\,dx
- U^{-1}(\bar{U} + a_\delta(y))\Big\}\,g(y)\,dy, \qquad (1.2)$$


where $a_\delta(y)$ is the first-best action strategy, given $\delta$.
The principal will benefit by increasing $\delta$ if:


$$\frac{dEU^\delta}{d\delta} = \int_X x \int_Y [p(x|a_\delta(y),y) - E_c p(x|a_\delta(y),c)]\,g(y)\,dy\,dx > 0, \qquad (1.3)$$

where the above equality follows from the Envelope Theorem.


Recall from part 1 that the principal will strictly prefer an informed agent ($\delta > 0$) to an uninformed agent
($\delta = 0$) if Assumption (1.b) holds. This assumption also implies that:


$$\int_X x \int_{Y'} [p_a(x|a,y) - E_c(p_a(x|a,c))]\,g(y)\,dy\,dx > 0. \qquad (1.b')$$


The proof of part 2 consists of demonstrating that Assumption (1.b′) $\Rightarrow$ condition (1.3). We will assume
that Assumption (1.b) and hence Assumption (1.b′) hold but that condition (1.3) does not hold and derive a
contradiction.

Let $a_\delta(y)$ be the optimal first-best action rule for $\delta > 0$ and assume that condition (1.3) does not hold, that
is:


$$\int_X \int_Y x\,[p(x|a_\delta(y),y) - E_c p(x|a_\delta(y),c)]\,g(y)\,dy\,dx \leq 0. \qquad (1.4)$$


Substituting this into condition (1.2) gives:


$$EU^\delta \leq \int_Y \Big[\int_X x\,E_c p(x|a_\delta(y),c)\,dx - U^{-1}(\bar{U} + a_\delta(y))\Big]\,g(y)\,dy. \qquad (1.5)$$


Let $a^*$ be the first-best action in the $\delta = 0$ case. Then the optimality of $a^*$ for the $\delta = 0$ case implies:


$$EU^\delta \leq \int_Y \Big[\int_X x\,E_c p(x|a_\delta(y),c)\,dx - U^{-1}(\bar{U} + a_\delta(y))\Big]\,g(y)\,dy
\leq \int_Y \Big[\int_X x\,E_c p(x|a^*,c)\,dx - U^{-1}(\bar{U} + a^*)\Big]\,g(y)\,dy. \qquad (1.6)$$


Equation (1.6) says that the principal weakly prefers an uninformed agent ($\delta = 0$) to an informed agent
($\delta > 0$). But this contradicts the implication of Assumption (1.b) that the principal strictly prefers an informed
agent to an uninformed agent (i.e., part 1 of Proposition 1). Therefore Assumption (1.b) implies condition (1.3),
which establishes part 2 of Proposition 1.

Q.E.D.


Proof of Proposition 2


For both parts of this proof, it will be convenient to restate the problem in a slightly different but
equivalent manner. First, with $h(a,y) = k(a)\,l(y)$ and $l'(y) > 0$, we can redefine the information variable to be
$w = l(y)$ rather than y. Let $g(w)$ be the induced probability distribution over the information variable w.
Second, instead of having the agent choose effort a with productivity effect $k(a(w))$, where $k'(a) > 0$ and
$k''(a) < 0$, and disutility $a(w)$, we can have the agent choose $z(w) = k(a(w))$ with disutility of $k^{-1}(z(w))$, where
$k^{-1}(\cdot)$ represents the inverse of $k(\cdot)$. Finally, $k'(a) > 0$ and $k''(a) < 0$ imply that:


$${k^{-1}}'(z) = \frac{1}{k'(a)} > 0 \quad \text{and} \quad {k^{-1}}''(z) = \frac{-k''(a)}{(k'(a))^3} > 0. \qquad (2.0)$$
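
Condition (2.0) is the usual inverse-function rule. As a quick check under an assumed production function k(a) = sqrt(a), for which k^{-1}(z) = z^2, the sketch below evaluates both sides of each equality; the function names and the choice of k are illustrative assumptions.

```python
def k_prime(a):
    """k'(a) for the assumed k(a) = sqrt(a)."""
    return 0.5 * a ** -0.5


def k_double_prime(a):
    """k''(a) for the assumed k(a) = sqrt(a)."""
    return -0.25 * a ** -1.5


def inverse_first(z):
    """(k^{-1})'(z) for k^{-1}(z) = z**2."""
    return 2.0 * z


def inverse_second(z):
    """(k^{-1})''(z) for k^{-1}(z) = z**2."""
    return 2.0


def check_condition_2_0(a):
    """Return (lhs, rhs) pairs for the two equalities in condition (2.0)."""
    z = a ** 0.5
    first_pair = (inverse_first(z), 1.0 / k_prime(a))
    second_pair = (inverse_second(z), -k_double_prime(a) / k_prime(a) ** 3)
    return first_pair, second_pair


if __name__ == "__main__":
    print(check_condition_2_0(4.0))
    print(check_condition_2_0(2.25))
```

Both derivatives of the inverse are positive here, as condition (2.0) requires whenever k is increasing and concave.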





Part 1.


The principal’s maximization problem, given the above transformation and given that the agent is
uninformed, is:


$$\max_{s(x),\,z} \int_W \int_X (x - s(x))[wz f_H(x) + (1 - wz) f_L(x)]\,g(w)\,dx\,dw$$

subject to

$$\text{(i)} \quad \int_X \int_W U(s(x))[wz f_H(x) + (1 - wz) f_L(x)]\,g(w)\,dx\,dw - k^{-1}(z) \geq \bar{U},$$

$$\text{(ii)} \quad \int_X \int_W U(s(x))\,w\,[f_H(x) - f_L(x)]\,g(w)\,dx\,dw - {k^{-1}}'(z) = 0.$$


Let $\{s^0(x), z^0\}$ be the optimal solution to the above problem. Then the first-order condition (ii) can be
rearranged to yield:


$$\int_X U(s^0(x))[f_H(x) - f_L(x)]\,dx = \frac{{k^{-1}}'(z^0)}{\bar{w}}, \qquad (2.1)$$


where $\bar{w} = E(w)$.
Let $\mu$ be the Lagrange multiplier on constraint (ii). It is straightforward to establish that $\mu > 0$. The adjoint
condition yields:


$$\int_X (x - s^0(x))[f_H(x) - f_L(x)]\,dx - \frac{\mu\,{k^{-1}}''(z^0)}{\bar{w}} = 0.$$


Because $\mu, \bar{w} > 0$, and ${k^{-1}}''(z^0) > 0$, we have,

$$\int_X (x - s^0(x))[f_H(x) - f_L(x)]\,dx > 0. \qquad (2.2)$$


Turning now to the informed agent ($\delta > 0$), we assume that the agent is given the uninformed contract
$s^0(x)$. The first-order condition corresponding to the choice of $z(w)$ can be written as:


$$\int_X U(s^0(x))[f_H(x) - f_L(x)]\,dx = \frac{{k^{-1}}'(z(w))}{\delta w + (1 - \delta)\bar{w}}. \qquad (2.3)$$

From conditions (2.1) and (2.3), we have,

$$\frac{{k^{-1}}'(z^0)}{\bar{w}} = \frac{{k^{-1}}'(z(w))}{\delta w + (1 - \delta)\bar{w}} \quad \text{for all } w. \qquad (2.4)$$

Because the left hand side of condition (2.4) is independent of w, and because the denominator of the right
hand side of condition (2.4) is increasing in w, it must be the case that the numerator of the right hand side of
condition (2.4) is also increasing in w. Because ${k^{-1}}''(z(w)) > 0$, this implies $z'(w) > 0$.
Also rearranging condition (2.4) and taking expectation with respect to w, we can see that:

$${k^{-1}}'(z^0) = E({k^{-1}}'(z(w))). \qquad (2.5)$$

If ${k^{-1}}'(\cdot)$ is concave, this in turn implies that:

$${k^{-1}}'(z^0) < {k^{-1}}'(E(z(w))),$$
or
$$z^0 < E(z(w)) = \bar{z}. \qquad (2.6)$$


That is, the average productivity $\bar{z}$ is higher in the informed case than the productivity $z^0$ in the
uninformed case.
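
Conditions (2.4) through (2.6) can be illustrated numerically. The sketch assumes the disutility k^{-1}(z) = z^{1.5}, whose derivative 1.5 z^{0.5} is concave, together with a made-up three-point distribution for w and an arbitrary uninformed choice of 0.3; the names and parameter values are illustrative only.

```python
def marginal_disutility(z):
    """(k^{-1})'(z) for the assumed disutility k^{-1}(z) = z ** 1.5."""
    return 1.5 * z ** 0.5


def informed_choices(z_uninformed, delta, ws, probs):
    """Solve condition (2.4) for z(w) under the assumed disutility.

    (2.4) scales the marginal disutility by (delta*w + (1-delta)*w_bar) / w_bar.
    Because the assumed marginal disutility is proportional to sqrt(z),
    z(w) equals z_uninformed times the square of that ratio.
    """
    w_bar = sum(p * w for p, w in zip(probs, ws))
    return [z_uninformed * ((delta * w + (1.0 - delta) * w_bar) / w_bar) ** 2 for w in ws]


def mean(values, probs):
    """Probability-weighted average."""
    return sum(p * v for p, v in zip(probs, values))


if __name__ == "__main__":
    ws, probs = [0.2, 0.4, 0.6], [1 / 3, 1 / 3, 1 / 3]
    zs = informed_choices(0.3, 0.5, ws, probs)
    print(zs, mean(zs, probs))
    print(mean([marginal_disutility(z) for z in zs], probs), marginal_disutility(0.3))
```

In this example the informed choices rise with w, their average marginal disutility matches the uninformed one as in (2.5), and their mean exceeds the uninformed choice as in (2.6).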

The agent is weakly better off in the informed case ($\delta > 0$) than the uninformed case ($\delta = 0$) because he has
the same compensation rule in both and can always choose $z^0$ in the informed case regardless of the signal.
Strict Pareto superiority is established by showing that the principal is strictly better off with an informed
agent ($\delta > 0$). The difference in the principal’s expected utility between the informed and uninformed cases is
given by:


$$\Delta EU^\delta = \int_W \int_X (x - s^0(x))\big\{\delta[z(w) w f_H(x) + (1 - z(w) w) f_L(x)]
+ (1 - \delta)[z(w)\bar{w} f_H(x) + (1 - z(w)\bar{w}) f_L(x)]\big\}\,g(w)\,dx\,dw$$

$$\qquad - \int_W \int_X (x - s^0(x))[z^0 w f_H(x) + (1 - z^0 w) f_L(x)]\,g(w)\,dx\,dw,$$

which, after some algebraic manipulation, gives:

$$= \int_X (x - s^0(x))[f_H(x) - f_L(x)]\,dx\,\{\delta E((z(w) - z^0)w) + (1 - \delta)\bar{w}(\bar{z} - z^0)\}.$$

Because:

$$\int_X (x - s^0(x))[f_H(x) - f_L(x)]\,dx > 0$$

(from condition [2.2]), and $\bar{z} > z^0$ (from condition [2.6]), to establish that $\Delta EU^\delta > 0$, it suffices to show that

$$E[(z(w) - z^0)w] > 0.$$

Now, because $z'(w) > 0$, $(z(w) - \bar{z})$ and w are positively correlated, that is:

$$E[(z(w) - \bar{z})w] > 0.$$

But because $\bar{z} > z^0$ (from condition [2.6]), and $w > 0$,

$$E[(z(w) - z^0)w] > E[(z(w) - \bar{z})w] > 0.$$

Hence, the principal is strictly better off.
Q.E.D.


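Part 2.


The principal’s maximization problem, given that the agent has access to a $\delta$-fine information system, is
given by the following program.


$$\max_{s(x),\,z(w)} \int_W \int_X (x - s(x))\big[\delta\{z(w) w f_H(x) + (1 - z(w) w) f_L(x)\}
+ (1 - \delta)\{z(w)\bar{w} f_H(x) + (1 - z(w)\bar{w}) f_L(x)\}\big]\,g(w)\,dx\,dw$$

subject to

$$\int_W \Big\{\int_X U(s(x))\big[\delta\{z(w) w f_H(x) + (1 - z(w) w) f_L(x)\}
+ (1 - \delta)\{z(w)\bar{w} f_H(x) + (1 - z(w)\bar{w}) f_L(x)\}\big]\,dx - k^{-1}(z(w))\Big\}\,g(w)\,dw \geq \bar{U},$$

$$\int_X U(s(x))[f_H(x) - f_L(x)](\delta w + (1 - \delta)\bar{w})\,dx - {k^{-1}}'(z(w)) = 0 \quad \forall w.$$

The agent’s first-order condition can be rewritten as:


$$\int_X U(s(x))[f_H(x) - f_L(x)]\,dx = \frac{{k^{-1}}'(z(w))}{\delta w + (1 - \delta)\bar{w}} \quad \forall w. \qquad (2.7)$$


Note that:


$${k^{-1}}'(z(w)) > 0 \;\Rightarrow\; \int_X U(s(x))[f_H(x) - f_L(x)]\,dx > 0.$$
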
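Further, the second-order condition with respect to the agent’s action choice is satisfied, that is,
$-{k^{-1}}''(z(w)) < 0$, from ${k^{-1}}''(z(w)) > 0$.
The adjoint equation, with respect to $z(w)$, is:


$$\int_X (x - s(x))[f_H(x) - f_L(x)](\delta w + (1 - \delta)\bar{w})\,dx - \mu(w)\,{k^{-1}}''(z(w)) = 0 \quad \forall w.$$


Claim 2.1:

$$\frac{dz(w)}{dw} > 0.$$

Proof: Taking the derivative of condition (2.7) with respect to w gives:

$$(\delta w + (1 - \delta)\bar{w})\,{k^{-1}}''(z(w))\,dz(w) - \delta\,{k^{-1}}'(z(w))\,dw = 0,$$

$$\frac{dz(w)}{dw} = \frac{\delta\,{k^{-1}}'(z(w))}{(\delta w + (1 - \delta)\bar{w})\,{k^{-1}}''(z(w))} > 0,$$

because ${k^{-1}}'(\cdot) > 0$ and ${k^{-1}}''(\cdot) > 0$ from condition (2.0).
Q.E.D.


Claim 2.2:

$$\frac{d\mu(w)}{dw} > 0.$$

Proof: Divide the adjoint equation by the agent’s decision rule:


$$\frac{\int_X (x - s(x))(f_H(x) - f_L(x))(\delta w + (1 - \delta)\bar{w})\,dx}{\int_X U(s(x))(f_H(x) - f_L(x))(\delta w + (1 - \delta)\bar{w})\,dx}
= \frac{\mu(w)\,{k^{-1}}''(z(w))}{{k^{-1}}'(z(w))}.$$


$\mu(w)$ will vary in w in an opposite way from how its coefficient varies in w because the left hand side of
the above equation is independent of w.


$$\frac{d\mu(w)}{dw} > 0 \quad \text{because} \quad \frac{d}{dw}\left[\frac{{k^{-1}}''(z(w))}{{k^{-1}}'(z(w))}\right] < 0 \quad \text{by assumption.}$$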
Claim 2.3: l piw) >0 Yw, 


and hence: 


\ [x~a(x)]L fax) fal x)]dx>0. 


Proof: From the adjoint equation and k-*”{z)>0, SS u(w)20 Vw, or a{w})}<0 Vw. Assume that 
u{w)<0 for all w. 
Let: 


Xt={x[fulx)>fclx)}, and 
x= {xi fa(x)Sfi(x)}. 
From the first-order condition with respect to s(x), this implies that: 
Sa ie 
U’(s(x)) 


eu whenever XC X. 
U"(s(x)) `, 


<À whenever xE Kr, 
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This, in turn, implies that: 
1 1 
Lee D me 
U'(s(xı)) U’ (s(x) 
Given concavity of the utility function: 
8(X:})>s(x.) for all x, EX, x EX". (2.8) 
From the adjoint equation, u(w)<0 implies that: 


for all x, E€ X- and x,€X*. 


E —s(x)][fulx) ~fi(x)}dx<0, 
or 
| Att-Atsitge- | s0<tfaG3-f.0alde<0. 
Bocause f,(x} FOSD f.(x), the first term in the left hand side of the above Is positive, and hence: 
| s0athat-f.0ldx>o, 
S 
= 8(x)[fulx)—fi(x)]dx> KE Stellt) -feislldn, 


Now let s* = Inf [{s{x},x€X*}. Then equation (2.8) implies that s* > s{x) for all ENG, Further, there exists 
an e >0 such that s*— « >s(x) for all xX". Then, 


& 


| (s*~ e)[fa{x)—f:(x)]dx > | s{x)[falx)—fı{(x)]dx 
DEA xGX* : 
> f s(x)[f:(x)—fa(x)]dx 
xEX~ 


> ) s*(flx)—fulx}]dx, 
x€X” 
which implies that: 
=é la (falx)—fi(x)]dx>0. 


But since f.(x)>f.(x) Vx€X* and €>0, the above inequality cannot hold, thus yielding a SES 
Hence, $\mu(w) \ge 0\ \forall w$. $\mu(w) > 0$ follows from the presence of moral hazard.
Q.E.D. 


Finally, the effect of a marginal change in å on the principal’s welfare is: 


—— 3 | È xstonttate—fetartztoitw—myetw)ddw 
af J 0 eNO ~ fi z(wi(w— W)g(w)dxdw 


lU veemtica—foamontw—migtwidaaw, 
From claims (2.1) and (2.2), both $z(w)$ and $\mu(w)$ increase in $w$. Further, $(w-\bar{w})$ increases in $w$. Thus both
$z(w)$ and $(w-\bar{w})$, and $\mu(w)$ and $(w-\bar{w})$, are positively correlated, and hence the three terms are positive.
Q.E.D. 
SYNOPSIS AND INTRODUCTION: This study addresses the issue of
self-selection bias in the analysis of economic consequences of mandatory
accounting changes. Self-selection bias arises from the use of truncated,
nonrandom samples to assess the behavior of firms using different
accounting methods at the time of the mandated change. Using ordinary
least squares (OLS) to estimate regression models containing data
generated by self-selected firms can yield inconsistent and inefficient
estimates of regression parameters.

The present study uses the case of SFAS No. 2, promulgated in 1974,
to illustrate the effects of selection bias on studying the economic
consequences of accounting regulation. The estimation method used to
correct for self-selectivity is a two-stage switching regression procedure
developed by Heckman (1976, 1979) and Lee (1976, 1978). Employing this
research method requires developing a complete model that explains the
accounting choice decision and R&D investment decision.

The switching regression model is estimated with data from 1973 to
correct for self-selection bias and to predict the likely economic
consequences of SFAS No. 2 prior to its adoption. The unbiased estimates of
the R&D equations are then used with the Wald test to examine structural
changes in the R&D model after the implementation of SFAS No. 2. To
examine the sensitivity of the results to self-selection bias, the analysis is
replicated with OLS estimates.

The results of the switching regression analysis indicate that selection
bias exists in both the capitalizing and expensing groups. This bias is
further shown in systematic differences between the results of OLS and
switching regression estimates. The OLS estimates consistently understate
the predicted values of R&D expenditures for both groups and appear to
understate the negative impact of SFAS No. 2 on the capitalizers’ R&D
expenditures.

The results of the Wald test show that observed changes in the
capitalizers’ R&D spending behavior after 1974 are attributable, at least in
part, to general macroeconomic phenomena. However, after controlling for
the effects of economywide changes, the analysis shows an incremental
effect of SFAS No. 2 on the R&D expenditures of former capitalizers.


Key Words: Self-selection bias, Switching regression, R&D, SFAS No. 2,
Economic consequences of accounting regulation.


Data Availability: The data will be made available upon request from the
author.


THE remainder of the paper is organized as follows. Section I presents an overview
of the sources of selection bias. Section II describes the research methodology
and the estimation procedures used to correct for self-selection bias in the
case of SFAS No. 2. Section III discusses the variables included in the accounting-choice
equation and the R&D investment equation. Section IV describes the sample
selection procedures and provides some descriptive statistics. The empirical results are
reported in section V, and section VI summarizes the research findings.


I. Sources of Selection Bias in Economic Consequences Research


The sources of self-selection bias in economic consequences research are varied in
several dimensions.¹ First, self-selection bias often occurs when managers must choose
between options (e.g., accounting methods). Managers do not choose randomly from
the available accounting alternatives but rather on the basis of the firm’s characteristics
and the comparative advantages of each method.² Because of this self-selection process,
managements’ economic decisions are conditional, at least in part, on the accounting-choice
decisions. Thus, contrasting the behavior of a sample of experimental
firms with the behavior of a sample of control firms at the time of the accounting
change involves a selection bias problem that results from the nonrandom assignment
of sample firms to the two groups. Heckman (1979, 1980) points out that using data
generated from nonrandomly selected samples to estimate behavioral relationships may
confound the behavioral parameters of interest with parameters of the function that
determines whether a particular observation is in the experimental or control sample.


¹ The concept of self-selection bias was first introduced by Roy (1951), who based an intuitive
explanation of this problem and its consequences on the theory of comparative advantage. The econometric
problems related to self-selectivity were discussed by Gronau (1974), Lewis (1974), and Heckman (1974). Formal
estimation techniques were developed by Heckman (1976) and Lee (1976). Extension and further refinement of the
estimation techniques were developed by Heckman (1979), Lee (1982, 1983), and Lee et al. (1980), among others.
Although the problem of self-selection bias has long been recognized by economists and has been addressed in
such areas as union membership, choice of occupation, housing, schooling, and labor force participation, it is
only recently that accountants have attempted to address this issue in the accounting context. Abdel-khalik
(1990a, 1990b) examines the problem of self-selection bias in two areas in the field of accounting. Also see
Acharya (1988), Smith (1987), and Spivey (1989) for applications in the field of corporate finance.

Second, as firms’ characteristics change over time, managers’ preferences for different
accounting alternatives may also change and result in a voluntary switch from
one method to another. This may cause a sample selection bias in the form of a
truncation problem.

Third, selection bias may arise when the choice of accounting method and the rele- 
vant economic decision (e.g., R&D) are jointly determined by a common set of unob- 
servable factors. In this case, the error terms in the functions explaining the accounting 
choice and economic decision would be correlated and have nonzero expectations. 

Finally, sample selection bias may result from specifying certain criteria for select- 
ing sample observations. Heckman (1980, 207) points out that “the effect of such cri- 
teria operates in precisely the same fashion as self-selection: fitted functions confound 
behavioral functions with sample selection functions.” 

Recent developments in econometrics suggest that, in the presence of self-selection
bias, using OLS in the usual fashion to estimate regression models could result in
inefficient and inconsistent estimates (Barnow et al. 1981; Heckman 1976, 1979; Lee 1976,
1978; Maddala 1983). Several two-stage switching regression methods have been
developed to correct for this bias.³

The two-stage switching regression methodology has several advantages compared 
with primary research methods: (1) it provides a means for consistently and efficiently 
estimating coefficients of models in the presence of selection bias, (2) it allows re- 
searchers to explicitly model and examine the relationship between the accounting- 
choice decision and the related production-investment decision, and (3) it provides ex- 
pectations of the likely economic consequences of proposed accounting changes prior 
to their adoption, which might provide useful input into the rule-making process. 


II. Application of Switching Regression Analysis to SFAS No. 2 


In October 1974, the FASB issued SFAS No. 2, which required all firms to expense
R&D costs as incurred. It has been argued that the elimination of the deferral option
might induce managers to alter their R&D investment decisions, thus producing
undesirable economic consequences. For example, Horwitz and Kolodny (1981, 253) state
that “spokesmen for the venture capital industry and managers of several small firms


² The accounting literature lacks a complete theory that explains managerial selection of accounting
methods and their associations with production-investment decisions. However, a growing body of literature
embraces the concept of rational choice as a descriptive paradigm to explain managerial selection of accounting
methods and views it as an integral part of the overall decision-making process, which, in itself, is guided by
the objective of the firm (see, e.g., Collins et al. 1981; Demski 1973; Dhaliwal 1984; Gordon 1964; Holthausen
and Leftwich 1983; Ijiri 1968).

³ For a summary review of these methods and their applications, see Amemiya (1985) and Maddala (1983).
testified at the hearings on the R&D issue in March 1974, that affected firms would alter
their behavior to ‘cushion’ the increased earnings volatility and lower earnings level
that would otherwise ensue.” To address this issue, several studies (Dukes et al. 1980;
Horwitz and Kolodny 1980; Elliott et al. 1984) have compared observed changes in R&D
expenditures for a sample of capitalizing firms with those of expensing firms during the
period surrounding the issuance of SFAS No. 2.⁴ Since firms are not randomly assigned
to the two samples, a potential self-selection bias may be introduced in assessing the
impact of this accounting change on firms’ R&D expenditures. The present study corrects
for this limitation by employing a two-stage switching regression methodology.


Formulation of the Switching Regression Model 


Managers make two interrelated decisions regarding R&D activities: how much to
invest in R&D, and which accounting method to choose. Let Z be a vector of exogenous
variables representing firm characteristics that influence the accounting-choice
decision, and let X be a vector of exogenous variables determining the R&D investment
decision.⁵ The switching regression model consists of three equations and may be
specified as follows:


$$AC_i^{*} = Z_i\theta - \varepsilon_i, \tag{1}$$
$$R\&D_{1i} = X_{1i}\beta_1 + U_{1i}, \text{ and} \tag{2}$$
$$R\&D_{2i} = X_{2i}\beta_2 + U_{2i}, \tag{3}$$


where $AC_i^{*}$ is a latent variable representing the firm’s preference either to capitalize
R&D costs (AC=1) or expense them (AC=0); $R\&D_{1i}$ and $R\&D_{2i}$ are observed R&D
expenditures for capitalizers and expensers, respectively; $\theta$, $\beta_1$, and $\beta_2$ are vectors of
coefficients; and $U_{1i}$, $U_{2i}$, and $\varepsilon_i$ are random error terms.

Equations (2) and (3) cannot be estimated directly by OLS because the presence of
selectivity bias causes $U_{1i}$ and $U_{2i}$ to be correlated with $\varepsilon_i$. Estimation methods that ignore
this correlation discount important information about the relationship between the
accounting choice and R&D investment decisions and may lead to biased estimates of
regression parameters. The error terms ($U_{1i}$, $U_{2i}$, and $\varepsilon_i$) are assumed to be
trivariate normally distributed, and the conditional means of the error terms $U_{1i}$ and $U_{2i}$
can be shown as follows (Heckman 1976; Lee 1976; Maddala 1983):


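$$E[U_{1i} \mid \varepsilon_i \le Z_i\theta] = E[\sigma_{1\varepsilon}\varepsilon_i \mid \varepsilon_i \le Z_i\theta]
  = \sigma_{1\varepsilon}\left[-\frac{\phi(Z_i\theta)}{\Phi(Z_i\theta)}\right], \text{ and}$$

$$E[U_{2i} \mid \varepsilon_i > Z_i\theta] = E[\sigma_{2\varepsilon}\varepsilon_i \mid \varepsilon_i > Z_i\theta]
  = \sigma_{2\varepsilon}\left[\frac{\phi(Z_i\theta)}{1-\Phi(Z_i\theta)}\right],$$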


⁴ These studies assume that sample firms are drawn from a homogeneous population and are randomly
assigned to the two groups.

⁵ In this article, it is assumed that the choice of a particular method depends on the following characteristics
of each firm: capital structure, firm size, volatility of reported earnings, volatility of R&D, and materiality
of R&D expenditures. The R&D decision is assumed to be a function of firm size, cash flows, the level of capital
investment, and the extent of the firm’s product diversification. A brief discussion of the theories and empirical
work supporting the variables included in both models is presented in section III.
where $\sigma_{1\varepsilon} = \mathrm{Cov}(U_{1i}, \varepsilon_i)$, $\sigma_{2\varepsilon} = \mathrm{Cov}(U_{2i}, \varepsilon_i)$, $\Phi$ is the cumulative distribution
function of the standard normal, and $\phi$ is its density function. The terms:

$$\left[-\frac{\phi(Z_i\theta)}{\Phi(Z_i\theta)}\right] \quad\text{and}\quad \left[\frac{\phi(Z_i\theta)}{1-\Phi(Z_i\theta)}\right],$$

known as the Mills ratios, represent the selectivity correction terms for each sample
observation in the capitalizing and expensing groups, respectively.
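
The two selectivity terms can be computed directly from a firm's probit index $Z_i\theta$. The following is a minimal sketch using only the Python standard library; the function names and the three index values are hypothetical and are not taken from the article.

```python
import math

def norm_pdf(c):
    """Standard normal density, phi(c)."""
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def norm_cdf(c):
    """Standard normal cumulative distribution function, Phi(c)."""
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def mills_capitalizer(index):
    """Selectivity term for a capitalizer: -phi(Z*theta) / Phi(Z*theta)."""
    return -norm_pdf(index) / norm_cdf(index)

def mills_expenser(index):
    """Selectivity term for an expenser: phi(Z*theta) / (1 - Phi(Z*theta))."""
    return norm_pdf(index) / (1.0 - norm_cdf(index))

# Hypothetical probit index values Z*theta for three firms (assumed for illustration).
for index in (-1.0, 0.0, 1.0):
    print(index, mills_capitalizer(index), mills_expenser(index))
```

The capitalizer term is always negative and the expenser term always positive, so the sign of each estimated coefficient on these terms carries the sign of the corresponding covariance.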

The estimation method adopted here is a two-stage estimation technique developed
by Heckman (1976, 1979) and Lee (1976, 1978). In the first stage, the accounting-choice
equation is estimated for the total sample using probit analysis. The estimated value of
$Z_i\theta$ is then used to generate the Mills ratio for each sample observation. In the second
stage, the Mills ratios are added to equations (2) and (3), which are then estimated by
OLS. Equations (2) and (3), therefore, can be rewritten as follows:⁶





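$$R\&D_{1i} = X_{1i}\beta_1 + \sigma_{1\varepsilon}\left[-\frac{\phi(Z_i\theta)}{\Phi(Z_i\theta)}\right] + \omega_{1i}, \text{ and} \tag{4}$$

$$R\&D_{2i} = X_{2i}\beta_2 + \sigma_{2\varepsilon}\left[\frac{\phi(Z_i\theta)}{1-\Phi(Z_i\theta)}\right] + \omega_{2i}, \tag{5}$$


where $\omega_{1i}$ and $\omega_{2i}$ are random error terms with zero expectations. Thus, estimating
equations (4) and (5) by OLS provides consistent estimates of $\beta_1$ and $\beta_2$.⁷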
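
The effect of adding the Mills ratio in the second stage can be illustrated with a small simulation. This is a sketch under stated assumptions, not the article's procedure: the data are simulated, the parameter values (`beta0`, `beta1`, `sigma_1e`) are hypothetical, and the probit index is treated as known so that the first-stage probit is skipped. Only the capitalizer equation is shown.

```python
import math
import random

def norm_pdf(c):
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def norm_cdf(c):
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def ols(rows, y):
    """Least squares via the normal equations (X'X)b = X'y, solved by Gauss-Jordan elimination."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] +
         [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [v - f * p for v, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def simulate(n=20000, beta0=1.0, beta1=2.0, sigma_1e=0.8, seed=12345):
    """Simulate firms, keep only self-selected capitalizers, and fit both regressions."""
    rng = random.Random(seed)
    naive_rows, corrected_rows, y = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        index = 0.7 * x + 0.7 * rng.gauss(0.0, 1.0)  # Z*theta, assumed known here
        eps = rng.gauss(0.0, 1.0)
        if eps > index:
            continue  # expenser: spending under capitalization is not observed
        u = sigma_1e * eps + 0.5 * rng.gauss(0.0, 1.0)  # Cov(u, eps) = sigma_1e
        mills = -norm_pdf(index) / norm_cdf(index)
        naive_rows.append([1.0, x])
        corrected_rows.append([1.0, x, mills])
        y.append(beta0 + beta1 * x + u)
    return ols(naive_rows, y), ols(corrected_rows, y)

naive, corrected = simulate()
print("OLS without correction (intercept, slope):", naive)
print("OLS with Mills ratio (intercept, slope, selectivity):", corrected)
```

In this setup the uncorrected intercept is pulled away from its true value because the selected firms have a nonzero conditional error mean, while the regression that includes the Mills ratio recovers the intercept, the slope, and the covariance term approximately.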

It should be noted that the selectivity term measures the covariance between the 
accounting-choice decision and R&D investment decision. Thus, the statistical signifi- 
cance of the estimated coefficient of this variable provides useful information about the 
extent to which the two decisions are interrelated. If accounting method choice and 
R&D spending are independent, then an involuntary switch to the expense method 
should not induce managers of previous capitalizers to alter their R&D investment deci- 
sions. Thus, the null hypothesis of no impact of SFAS No. 2 on previous capitalizers’ 
R&D expenditures can be stated as: 


$$H_0: \sigma_{1\varepsilon} \ge 0.$$


⁶ There are two potential problems with these estimation procedures. First, the error terms in equations (4)
and (5) are heteroskedastic. This can be corrected either by White’s (1980) correction procedures or by estimating
equations (4) and (5) with weighted least squares rather than OLS (Maddala 1983, 225). Second, Lee et al.
(1980) show that the true variances in equations (4) and (5) will be underestimated since the selectivity variables
are estimates. The computer package used in this study, LIMDEP, automatically corrects for this problem.

⁷ The analysis assumes that the variables influencing the accounting-choice decision are purely exogenous.
However, it can be argued that accounting choice and R&D investment decisions are simultaneously
determined. In this case, R&D should be treated as an endogenous variable in the accounting-choice equation and
the model should be estimated simultaneously.

This estimation method was considered in an earlier version of this article, but Professor Maddala (1991)
pointed out that treating R&D as an endogenous variable in the accounting-choice equation caused the model to
be logically inconsistent. Therefore, the simultaneous analysis has been omitted here.
Maddala (1983, 1991) points out that, although the major concern in the analysis of
self-selectivity is testing for selectivity bias, it is important to estimate the mean value of
the dependent variables for the alternative choice. Thus, $\hat{\beta}_2$ from equation (5) can be
used to compute the mean value of R&D expenditures for capitalizing firms had they
chosen to be expensers, and $\hat{\beta}_1$ from equation (4) can be used to predict the mean value
of R&D expenditures for expensing firms had they elected to be capitalizers.⁸

A comparison of the expected value of R&D for each group under their chosen
method and the expected value of R&D under the nonpreferred alternative provides a
measure of the likely change in firms’ R&D spending behavior if they were forced to
switch to the other alternative.


Testing for Structural Changes in the R&D Model 


After correction for selection bias, the reaction of capitalizing firms to the
implementation of SFAS No. 2 can be examined by testing for structural change in the R&D
model during the years 1975-1978. Three basic tests for structural change are conducted
(see Johnston 1984, 207-25): (1) test of differential regression (i.e., both intercepts
and slopes), (2) test of differential intercepts, and (3) test of differential slopes. The
null hypothesis of no structural changes can be set as:


(os! Je 


~ [Bal Il 
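
As a reminder of the general form that such tests take, a Wald test of a set of linear restrictions on a coefficient vector can be written as below. The symbols $R$, $r$, $\hat{b}$, $\hat{V}$, and $q$ are generic notation introduced here for illustration and are not the article's own.

```latex
H_0:\; R\,b = r, \qquad
W = (R\hat{b} - r)'\,\bigl[R\,\hat{V}\,R'\bigr]^{-1}\,(R\hat{b} - r)
  \;\overset{a}{\sim}\; \chi^2_q ,
```

where $\hat{b}$ is the estimated coefficient vector, $\hat{V}$ is its estimated covariance matrix, and $q$ is the number of restrictions (rows of $R$). If the pre-change and post-change coefficients are stacked in $b$, a test of equal intercepts and slopes sets $R = [\,I \;\; -I\,]$ and $r = 0$; the tests of differential intercepts or differential slopes keep only the corresponding rows of $R$.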


III. Models’ Explanatory Variables
Determinants of Accounting Method Choice 


1. Leverage: The higher the leverage, the higher the probability of defaulting on debt 
(Smith and Warner 1979) and the greater the use of debt covenant restrictions. These 
covenants induce certain managerial behavior, including the choice of accounting 
methods. Research findings have shown that highly leveraged firms tend to choose 
income-increasing accounting methods such as capitalization of R&D (Bowen et al. 
1981; Daley and Vigeland 1983; Dhaliwal 1980; Leftwich 1980). 

2. Firm Size: Size as an explanatory variable is consistent with various hypotheses, 
notably the political cost hypothesis (Dhaliwal et al. 1982; Hagerman and Zmijewski 
1979; Watts and Zimmerman 1978). In particular, it is argued that large firms will 
tend to reduce their political cost by selecting income-decreasing policies, such as 
the expensing of R&D. 

3. Earnings Volatility: The income-smoothing literature, beginning with Gordon
(1964), suggests that managers’ interests in avoiding income volatility lead them to
manage discretionary expenditures so as to smooth reported earnings (see, e.g., Lys
1984). Thus, it is expected that firms with highly volatile business will tend to choose
capitalization.

4. Materiality of R&D Expenditures: The National Science Foundation annual survey
(1980) shows that company-financed R&D as a percentage of sales varies considerably
between firms and across industries. This evidence is corroborated by
Madden et al. (1972), who showed that the percentage of R&D to sales averages
between 2.5 and 50 percent. This suggests that reported income could be highly
sensitive to the amount of R&D costs charged to the income statement. Thus, it is
expected that the higher the relative magnitude of R&D outlays to reported income,
the more likely managers are to capitalize R&D.


⁸ Note that the estimated coefficient of the selectivity variables ($\sigma_{j\varepsilon}$) is not used in making these
predictions nor in any further analysis. These terms are used to determine whether selection bias exists and to
provide consistent estimates of $\beta$.

5. Volatility of R&D Expenditures: Elliott et al. (1984) report evidence suggesting that
capitalizing firms have greater volatility of R&D expenditures than expensing firms.
This finding indicates that firms with less stationary streams of R&D projects tend to
capitalize R&D. This is consistent with the notion that the smoothing of annual income
can be achieved better by amortization of expenditures. Thus, the variability of
R&D projects is a descriptive factor that distinguishes between capitalizing and
expensing firms.

Determinants of R&D Investments 


1. Firm Size: A debate has been ongoing about the efficacy of Schumpeter’s (1950)
hypothesis asserting that larger firms are financially better equipped to undertake
large-scale R&D projects than are smaller firms. The hypothesis finds support in
Scherer (1984) but is contradicted by others (Mason 1951; Nelson 1959) because of
assumed deterioration of incentives. However, several studies have reported evidence
showing that R&D is proportionately related to firm size (Grabowski 1968;
Scherer 1967; Shrieves 1978). To assess the validity of this claim, firm size and
squared values of firm size are used as two explanatory variables.

2. Cash Flow: The ability to finance R&D with external funds could be limited by the
riskiness of R&D activities (Bozeman and Link 1984; Grabowski 1968; Kamien and
Schwartz 1978; Link and Long 1981; Nelson and Winter 1982). Thus, internal financing
availability becomes critical. The higher the cash flow generated internally, the
greater the firm’s ability to commit to risky R&D projects. That is, the correlation is
expected to be positive.

3. Capital Investments: Prosperous firms tend to invest in both R&D and other capital
projects, although these are competing alternatives for the use of a firm’s cash
(Grabowski and Mueller 1972). However, the evidence is inconsistent; the National
Science Foundation (1964) and Crowell (1987) report a positive association between
the two types of investments. Although no directional relationship can be posited
between R&D and capital investments, the latter is used in this study as a competing
alternative for the commitments made by the firm.

4. Product Diversification: Nelson (1959) asserts that, due to the uncertainty of the
outcome of research activities, highly diversified firms will be better able to absorb
the shocks of unexpected results. Highly diversified firms invest in a greater number
of projects, which reduces the risk inherent in R&D activities (Scherer 1967). Thus,
the higher the degree of diversification, the lower the expected risk inherent in
investing in R&D activity.


IV. Sample Selection and Data


The experimental sample consists of firms that capitalized R&D and were identified
by reviewing the Disclosure Journal Index of Corporate Events during the period from
May 1974 to April 1976. In addition, Moody’s Industrial and OTC manuals were examined
for the period 1975-1978 to identify firms that restated their prior years’ financial
statements in compliance with SFAS No. 2. A total of 450 firms were identified as
capitalizers.


Table 1
Operational Definitions of Variables Included in the Accounting-Choice Model

Variable   Predicted Sign         Definitionᵃ

AC         (dependent variable)   A dummy variable equal to one if the firm is a
                                  capitalizer and zero otherwise.

LEV        +                      The leverage ratio computed by dividing the book
                                  value of long-term debt by the sum of the market value
                                  of the equity and the book value of the long-term debt.

lnSALE     -                      The natural logarithm of net sales.

RDVAR      +                      An index of the volatility of R&D expenditures,
                                  measured ... 1973.

NIVAR      +                      An index of the volatility of reported earnings,
                                  ... to 1973.

RDM        +                      An index of the materiality of R&D expenditures,
                                  measured by dividing the average of R&D expenditures
                                  (1970-1973) by the average net income after taxes.

ᵃ The financial variables are converted to constant dollars with the Implicit Price Deflator for GNP
(1982 = 100).

To be included in the experimental sample, a firm must have switched to the expensing
method after November 30, 1974, and must be available on the COMPUSTAT
tapes; this reduced the sample to 121 firms: 67 OTC firms and 54 listed firms.

A control sample of 250 firms was randomly selected from all other firms on the
COMPUSTAT tapes that had adopted the expense method at least one year before 1974.
Of the 228 firms satisfying these criteria, 39 were OTC firms and 189 were listed
firms.

Definition of Variables and Data Description 


Tables 1 and 2 summarize the operational definitions of variables included in the
accounting choice and R&D models, respectively. Table 3 reports descriptive statistics
of the relevant variables for the sample firms. These summary statistics suggest some
basic differences between the two groups. Capitalizing firms were, on average, smaller
than expensing firms. The mean and the standard deviation of leverage, volatility of
R&D and earnings, and materiality of R&D expenditures were larger for the capitalizing
group than for the expensing group. In addition, capitalizing firms had lower cash-flow
amounts and spent less on capital investment than expensing firms.
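
The variable definitions in table 1 translate into simple ratios. The sketch below follows those definitions as stated; the function names and the input figures are hypothetical, and the volatility indexes (RDVAR, NIVAR) are omitted because their exact construction is not recoverable from the table.

```python
import math

def to_constant_dollars(nominal, deflator):
    """Convert a nominal amount to constant dollars with a price index whose base year equals 100."""
    return nominal / deflator * 100.0

def leverage(book_long_term_debt, market_value_equity):
    """LEV: book long-term debt / (market value of equity + book long-term debt)."""
    return book_long_term_debt / (market_value_equity + book_long_term_debt)

def ln_sale(net_sales):
    """lnSALE: natural logarithm of net sales."""
    return math.log(net_sales)

def rd_materiality(rd_by_year, net_income_by_year):
    """RDM: average R&D expenditures divided by average after-tax net income (same years)."""
    avg_rd = sum(rd_by_year) / len(rd_by_year)
    avg_ni = sum(net_income_by_year) / len(net_income_by_year)
    return avg_rd / avg_ni

# Hypothetical firm, 1970-1973 (all figures invented for illustration).
rd = [2.0, 4.0, 6.0, 8.0]
net_income = [40.0, 50.0, 60.0, 50.0]
print("LEV:", leverage(50.0, 150.0))
print("lnSALE:", ln_sale(400.0))
print("RDM:", rd_materiality(rd, net_income))
print("Constant dollars:", to_constant_dollars(120.0, 60.0))
```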
Table 2
Operational Definitions of Variables Included in the R&D Modelᵃ

Variable   Predicted Sign         Definitionᵇ

RD         (dependent variable)   R&D expenditures incurred by each firm. R&D spending
                                  before 1974 was collected manually from firms’
                                  10-Ks and annual reports. R&D spending after 1974
                                  was obtained from the COMPUSTAT tapes.

TA         +                      Total assets (net of intangible assets) as reported on the
                                  COMPUSTAT tapes.

TASQR      ?                      The square of total assets (net of intangible assets).

CF         +                      Cash flow available for each firm lagged one period.
                                  The cash flow is measured as the sum of net income
                                  before extraordinary items, depreciation, amortization
                                  and depletion, and change in current liabilities
                                  and noncash current assets from operations.

CAP        -                      Total amount of capital expenditures incurred by
                                  each firm.

DIV        +                      An index of product diversification measured by the
                                  number of four-digit SIC industries in which the firm
                                  operates. The data on diversification were collected by
                                  hand from Moody’s Industrial and OTC manuals.

ᵃ The absolute dollar amount of the dependent and independent variables in the R&D model are scaled by
their respective industry averages. The industry averages are computed for all companies available on the
COMPUSTAT tapes using the two-digit SIC codes.

ᵇ The financial variables are converted to constant dollars with the Implicit Price Deflator for GNP
(1982 = 100).


V. Empirical Results 
Results of the Switching Regression Analysis


Table 4 reports the estimates of the switching regression model. The probit estimates
of the accounting-choice equation are presented in panel A, and the OLS estimates
of the R&D equations (corrected for self-selection bias and for heteroskedasticity)
are reported in panels B and C for capitalizing and expensing samples, respectively.

The results in panel A indicate that the accounting-choice model has good overall
explanatory power and high classificatory ability (87 percent of the firms in the sample
are correctly classified). The estimated coefficients of all independent variables are
statistically significant and directionally consistent with prior expectations. In general,
the firms that are more likely to prefer the capitalization method are small and highly
leveraged, have high volatility of R&D expenditures and more variable earnings, and
spend a significant portion of their income on R&D activities.
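
To see how the probit estimates map firm characteristics into a predicted choice, the index $Z\theta$ can be evaluated and passed through the standard normal distribution function. The sketch below uses the coefficients as read from panel A of table 4 and the 1973 sample means as read from table 3; evaluating the model at the group means, and the 0.5 classification cutoff, are illustrative assumptions made here and are not calculations reported in the article.

```python
import math

def norm_cdf(c):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

# Probit coefficients as read from panel A of table 4.
THETA = {"Intercept": -0.340431, "LEV": 2.33445, "lnSALE": -0.540357,
         "RDM": 8.62086, "NIVAR": 0.109898e-03, "RDVAR": 0.250880e-01}

# 1973 sample means as read from the "Total" columns of table 3.
CAPITALIZER_MEANS = {"LEV": 0.9256, "lnSALE": 3.3987, "RDM": 0.0440,
                     "NIVAR": 43.0640, "RDVAR": 59.9520}
EXPENSER_MEANS = {"LEV": 0.2493, "lnSALE": 5.0784, "RDM": 0.0219,
                  "NIVAR": 27.4131, "RDVAR": 21.1480}

def prob_capitalize(firm):
    """Predicted probability that AC = 1, i.e., Phi(Z * theta)."""
    index = THETA["Intercept"] + sum(THETA[name] * value for name, value in firm.items())
    return norm_cdf(index)

def classify(firm, cutoff=0.5):
    """Classify a firm as capitalizer or expenser; the 0.5 cutoff is an assumption."""
    return "capitalizer" if prob_capitalize(firm) >= cutoff else "expenser"

for label, firm in (("capitalizing-sample means", CAPITALIZER_MEANS),
                    ("expensing-sample means", EXPENSER_MEANS)):
    print(label, prob_capitalize(firm), classify(firm))
```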

A comparison of panels B and C indicates that the two groups of firms exhibit quite
different behavior with respect to R&D. The estimated coefficients in the capitalizing
sample are larger in absolute value than those in the expensing sample. Moreover, the
statistical significance and the signs of several variables differ.

The estimated coefficients of total assets (TA) and the square of total assets
(TASQR) for the expensing group indicate that R&D is a positive linear function of size.
Table 3
Summary Statistics of the Variables for Capitalizing and Expensing Firms:
Mean and Standard Deviation in 1973

                    Capitalizing Sample                     Expensing Sample
                             Listed          OTC                    Listed          OTC
Variable         Total        Firms        Firms        Total        Firms        Firms
n                  121           54           67          228          189           39
AC Equation:
LEV             0.9256       1.2221       0.6936       0.2493       0.2938       0.0553
               (1.3270)     (1.6633)     (1.0108)     (0.4243)     (0.3543)     (0.5988)
lnSALE          3.3987       4.1143       2.6781       5.0784       5.3972       3.7550
               (1.3828)     (1.3083)     (1.0850)     (1.6518)     (1.6211)     (0.9318)
RDVAR          59.9520      60.7704      59.3614      21.1480      20.9483      21.5227
              (44.9210)    (44.4533)    (45.6003)    (13.4781)    (13.8123)    (11.9362)
NIVAR          43.0640      47.2312      11.7848      27.4131     874.2010      39.6839
              (34.2850)    (31.2149)    (29.6520)    (19.3720)  (3284.2752)    (74.6722)
RDM             0.0440       0.0227       0.0608       0.0219       0.0226       0.0187
               (0.1268)     (0.0248)     (0.1668)     (0.0174)     (0.0176)     (0.0167)
R&D Equation:
RD              0.5143       0.5433       0.4917       0.8381       0.9865       0.1926
               (1.1148)     (1.1434)     (1.0999)     (1.4921)     (1.6168)     (0.2649)
TA              0.3991       0.5631       0.2707       1.9914       1.6658       0.2057
               (0.7337)     (0.9147)     (0.5250)     (3.0881)     (3.3700)     (0.2157)
CF              0.3321       0.5018       0.1992       1.2369       1.4647       0.2507
               (0.7064)     (0.9673)     (0.3539)     (2.7059)     (2.9586)     (0.2808)
CAP             0.3839       0.5562       0.2490       1.1737       1.4102       0.1528
               (0.8698)     (1.2226)     (0.3878)     (2.5284)     (2.7544)     (0.1462)
DIV             2.0569       2.6851       1.5652       2.6837       2.9874       1.4103
               (1.2887)     (1.3711)     (0.9774)     (1.6800)     (1.7025)     (0.6373)


Note: The standard deviations are in parentheses.
LEV = debt-to-equity ratio,
lnSALE = the natural logarithm of net sales,
RDVAR = an index of the volatility of R&D expenditures,
NIVAR = an index of the volatility of reported earnings,
RDM = an index of the materiality of R&D expenditures,
RD = R&D expenditures,
TA = total assets (net of intangible assets),
CF = cash flow,
CAP = capital expenditures, and
DIV = an index of product diversification.


For the capitalizing sample, R&D funding increases proportionately with firm size up to
a certain point, then increases at a decreasing rate. The estimated coefficients of cash
flow (CF) suggest that the availability of internally generated funds is important in
explaining R&D variation for the capitalizing firms, but not for the expensing group. The
estimated coefficient of capital expenditures (CAP) is negative and significant for the
capitalizing sample, positive and significant for the expensing sample. This suggests











Table 4
Two-Step Estimates of the Complete Model

                                       Standard                        Significance
Variable              Coefficient      Error           t-Ratio         Level

Panel A. AC Equation:
Intercept             -0.340431        0.431160       -0.790           0.42978
LEV                    2.33445         0.340783        6.851           0.00000
lnSALE                -0.540357        0.899552E-01   -6.007           0.00000
RDM                    8.62086         3.88917         2.217           0.02685
NIVAR                  0.109898E-03    0.320855E-04    3.428           0.00061
RDVAR                  0.250880E-01    0.420688E-02    5.964           0.00000
Chi-squared 225.24
Log-likelihood -100.7
Cases correctly classified as capitalizers 89 (n=121)
Cases correctly classified as expensers 218 (n=228)

Panel B. R&D Equation (Capitalizing Sample):
Intercept              0.620488        0.155936        3.979           0.00007
TA                     1.57976         0.279149        5.659           0.00000
TASQR                 -0.577391        0.143192       -4.032           0.00006
CF                     1.83815         0.692464        2.655           0.00794
CAP                   -0.760388        0.258129       -2.946           0.00322
DIV                   -0.237283        0.661936E-01   -3.584           0.00034
Selectivity variable  -0.379689        0.175826       -2.159           0.03081
Adjusted R²=0.5125

Panel C. R&D Equation (Expensing Sample):
Intercept              0.264902        0.588020E-01    4.497           0.00001
TA                     0.271388        0.114328        2.374           0.01761
TASQR                 -0.253627E-02    0.387717E-02   -0.654           0.51301
CF                     0.365116E-01    0.132560        0.268           0.78878
CAP                    0.272539        0.116944        2.331           0.01978
DIV                   -0.475249E-01    0.259751E-01   -1.830           0.06730
Selectivity variable   0.831273E-01    0.378740E-01    2.195           0.02818
Adjusted R²=0.85608

Note: The definitions of the variables are the same as in table 3.


that although R&D and capital investments are competing decisions for the capitalizing
group, they are complementary decisions for the expensing group. Contrary to prior
expectations, the estimated coefficient of product diversification (DIV) is negative and
significant for both groups, indicating that highly diversified firms spend less on R&D.
This finding suggests that R&D effort may be more productive if concentrated in a few
product areas (Kamien and Schwartz 1982).
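
The size effect described for the capitalizing sample can be made concrete by differentiating the fitted quadratic in TA. The calculation below is an illustration based on the TA and TASQR coefficients as read from panel B of table 4; it is not a result reported in the article, and TA is measured in the industry-scaled units defined in table 2.

```latex
\frac{\partial\,RD}{\partial\,TA}
  = \hat{b}_{TA} + 2\,\hat{b}_{TASQR}\,TA
  = 1.57976 - 2(0.577391)\,TA ,
\qquad
TA^{*} = \frac{1.57976}{2 \times 0.577391} \approx 1.37 .
```

The marginal effect of size is positive but declining for $TA < TA^{*}$. The mean TA of the capitalizing sample in table 3 (0.3991) lies well below this value, so under these estimates the typical capitalizer is on the rising portion of the curve.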

These results indicate that the two groups of firms are different in characteristics as
well as in the structure of their R&D investment function. Thus, it would not be correct
to assume that the sample firms in the two groups are randomly selected from a
homogeneous population.
Table 5
OLS Estimates of R&D Equations

                               Standard                Significance
Variable        Coefficient    Error       t-Ratio     Level

Panel A. Capitalizing Sample:
Intercept        0.5065        0.1400       3.618      0.0006
TA               1.5204        0.2977       5.107      0.0000
TASQR           -0.5787        0.0864      -5.986      0.0000
CF               1.8572        0.3068       6.054      0.0000
CAP             -0.7497        0.1983      -3.781      0.0003
DIV             -0.2592        0.0584      -4.440      0.0000
Adjusted R² = 0.49379

Panel B. Expensing Sample:
Intercept        0.2346        0.0774       3.032      0.0029
TA               0.2728        0.0532       5.127      0.0000
TASQR           -0.0027        0.0018      -1.484      0.1325
CF               0.0408        0.0680       0.615      0.5467
CAP              0.2703        0.0485       5.808      0.0000
DIV             -0.0451        0.0277      -1.831      0.1001
Adjusted R² = 0.85630


TA =Total assets (net of intangible assets). 
TASQR=The square of total assets (net of intangible assets). 
CAP=Capital expenditures.
CF=Cash flow. 
DIV= An index of product diversification. 


The estimated coefficient of the selectivity variable for the capitalizing sample is
negative and statistically significant at the 0.06 level. This result suggests that average 
R&D expenditure for capitalizing firms with given measured characteristics is, ceteris 
paribus, likely to exceed what these firms would have spent under the expense method. 
Similarly, the positive and statistically significant coefficient of the selectivity variable 
for the expensing group indicates that the expensing firms are likely to spend more on 
R&D under the expense method, ceteris paribus, than they would have spent under the 
capitalization method. These results further indicate that the omitted variables that im- 
pact accounting choice are positively correlated with the R&D investment decision. 

For comparative purposes, the OLS estimates of the R&D equations without correc- 
tion for self-selection bias are reported in table 5. A comparison of the switching regres- 
sion estimates for the capitalizing group (panel B in table 4) with those obtained from 
OLS (panel A in table 5) reveals several interesting points. All estimated coefficients 
that are significant under OLS are also significant under the switching regression, and 
the signs are the same. With the exception of the CF and CAP variables, OLS consis- 
tently underestimates the positive coefficients and overestimates the negative coeffi- 
cients. The difference between the estimated coefficients under the two estimators is 5 
percent for TA and appears to be minor for TASQR, CF, and CAP. The coefficients of 
DIV, however, differ markedly. The OLS overestimates the negative effect of DIV by 9 
Table 6

Average Expected Value of R&D Expenditures if All Firms Were Using
the Same Accounting Method Before SFAS No. 2 (1973 Data)

                        Two-Step Estimates                      OLS Estimates
                 All                                    All
                 Firms     Capitalizers  Expensers      Firms     Capitalizers  Expensers
                 (1)       (2)           (3)            (4)       (5)           (6)
E[RD | β_C]      0.7604a   0.6999a       0.7928a        0.5907c   0.5269c       0.6245c
E[RD | β_E]      0.6897b   0.3898b       0.8595b        0.6897d   0.3698d       0.8596d

E[RD | β_C] = The average expected value of R&D expenditures when assuming that all firms were capital-
izers, where β_C is a vector of the estimated coefficients of the R&D equation for the capitalizing
sample.

E[RD | β_E] = The average expected value of R&D expenditures when assuming that all firms were expensers,
where β_E is a vector of the estimated coefficients of the R&D equation for the expensing sample.

a The expected value of R&D is calculated with the estimated coefficients obtained from equation (4) (not
including the estimated coefficient of the selectivity correction factor).

b The expected value of R&D is calculated with the estimated coefficients obtained from equation (5) (not
including the estimated coefficient of the selectivity correction factor).

c The expected value of R&D is calculated with the estimated coefficients obtained from equation (2).

d The expected value of R&D is calculated with the estimated coefficients obtained from equation (3).


percent. The largest difference for the capitalizing group is between the estimated con- 
stant terms. This difference alone indicates that the mean value of R&D expenditures is 
underestimated by 18.5 percent with OLS. These results suggest that OLS estimates will 
predict lower R&D investments after the implementation of SFAS No. 2 than will the 
switching regression. A comparison of the OLS and switching regression estimates for 
the expensing group provides similar findings. 

Table 6 reports predictions of the expected values of R&D expenditures for both 
expensing and capitalizing samples had they elected to be in the other group. These 
predictions are formed by using the estimated coefficients from both the switching 
regression and OLS estimates. If all firms used the same accounting method before the 
issuance of SFAS No. 2, the mean value of R&D expenditures would have been higher 
under the capitalization method (0.76) than under the expense method (0.69). The 
results further show that the mean value of R&D for each group is expected to be lower
under the nonpreferred alternative. The decline is more pronounced for the capitalization
group (from about 0.69 to about 0.37) than for the expensing group (from about 0.85
to about 0.79). These results support the argument that firms choose between account- 
ing methods on the basis of their own characteristics and the relative advantages of 
each method. 

A comparison of the mean value of predicted R&D expenditures shows that OLS 
consistently provides lower predictions than those obtained from the switching regres-
sion estimates. Contrary to the prediction of the switching regression model, OLS 
predicts lower mean values of R&D expenditures under the capitalization method (0.59) 
than under the expense method (0.69) if all firms had used the same accounting method 
before SFAS No. 2. By contrast, OLS predicts higher mean values of R&D for each 
group under its chosen accounting method than under the alternative method. However,
the OLS overestimates the expected reduction in the mean value of R&D of
expensing firms if they were forced to capitalize R&D (from 0.86 to 0.62) as compared 
with the predictions of the switching regression (from 0.88 to 0.76). 

To summarize, the switching regression analysis indicates the presence of selection 
bias in both groups. Hence, correction for this bias is important in the assessment of the 
effects of SFAS No. 2 on R&D. The switching regression model predicted potential de- 
cline in capitalizers’ R&D in response to SFAS No. 2. 


Results of Testing for Structural Changes in the R&D Model 


Table 7 presents some descriptive statistics and the regression estimates of the R&D 
equations for both groups during the period 1975-1978. The two groups of firms exhibit 
different behavioral structures with respect to R&D. As reflected in the estimated 
coefficients of the R&D model, these differences continue to exist after the elimination 
of the accounting-method choice. The fact that the estimated coefficients of the R&D 
equation for the capitalizing group did not approach those of the expensing group 
might be caused by the mandatory nature of the accounting change. That is, forcing 
capitalizers to expense R&D does not immediately eliminate structural differences be- 
tween the pre-SFAS No. 2 capitalizers and expensers. 

Table 8 reports the results of using the Wald statistic to test for structural changes 
in the R&D model during 1975-1978.⁹ The R&D function of capitalizing firms (panel A)
seems to have undergone structural changes during that period. However, these 
changes, as reflected by the regression slopes, cannot be attributed to SFAS No. 2 be- 
cause they could have been the outcome of macroeconomic factors. This possibility 
seems to be confirmed by tests for structural changes that were also conducted for the 
expensing group. As shown in panel B of table 8, the expensing firms also exhibited sig- 
nificant structural changes during the period 1975-1978. In this case, the average im- 
pact of SFAS No. 2 on capitalizers’ R&D expenditures can be assessed by contrasting 
the average prediction error for the capitalizing group with the average prediction error 
for the expensing group. The average prediction errors for both groups during the 
period 1975-1978 are calculated with the estimated coefficients of the R&D model 
corrected for self-selection bias in equations (4) and (5), and are reported in columns 1 
and 2 of table 9.¹⁰ The average prediction errors for the expensing group during the
period 1975-1978 are consistently higher than those of the capitalizing group. This
finding suggests that, aside from the effects of economywide changes, capitalizing
firms seem to have reduced their relative R&D expenditures following the implementation
of SFAS No. 2.

For comparative purposes, the post-SFAS No. 2 analyses presented above are re- 
peated with the OLS estimates without corrections for selectivity. The results of testing 


9 An alternative test for structural change is the Chow (1960) test. The underlying assumption is that the
variances of the error terms in the two regression equations are equal. Toyoda (1974) shows that if this assump- 
tion is violated, the Chow test may be inaccurate. To examine the equality of the error variances, the researcher 
performed an F-test on the standard errors of the regression equations in the two periods (see Maddala 1988, 
136). The results indicate that the error variances are not equal in the two sample periods. Honda (1982), based 
on Watt (1978), suggests that the Wald test (Wald 1943), which is asymptotically distributed as chi-squared, is 
the best alternative in the case of unequal error variances, particularly in the case of large sample size.

10 We assume that the macroeconomic change affects both groups of firms in the same way, although the 
estimated coefficients differ in magnitude and in some signs. Therefore, these results should be interpreted 
with caution. 
Table 7
Descriptive Statistics and Regression Estimates of the R&D Model
During the Period 1975-1978
1975 1976 1977 1978 
Panel A. Capitalizing Sample: 
Descriptive Statistics 
N 112 169 98 87 
RD 0.3399 0.2746 0.3131 0.2966 
(0.714) (0.648) (0.684) (0.599) 
TA 0.4106 0.3800 0.4068 0.4008 
(1.190) (0.444) (1.275) (1.267) 
CF 0.3315 0.2248 0.3506 0.3526 
(1.302) (1.303) (1.178) (1.154)
CAP 0.3694 0.6681 0.4074 0.4244 
(1.277) (0.703) (1.737) (1.508) 
DIV 2.0583 1.9417 2.175 2.1833 
(1.298) (1.324) (1.351) (1.347) 
Estimated Coefficients 
Intercept 0.1566 0.2633* 0.4178* 0.4131* 
TA 0.9358* 0.8778* 0.7544*** 1.2582*
TASQR -0.0495*** -0.0739** -0.0949" -0.0751"
CF 0.6370** 0.3585 1.0197*** 0.1752
CAP -1.0035* -0.3487 -0.5467* -0.5847**
DIV 0.0178 -0.0853 -0.1578" -0.1426"
Adjusted R² 0.3905 0.28635 0.6031 0.5582
Standard error of regression 0.5580 0.5778 0.4598 0.3905 
Panel B. Expensing Sample: 
Descriptive Statistics 
N 219 211 207 198 
RD 1.1543 1.1474 1.1083 1.0731 
(2.879) (2.818) (2.810) (2.752) 
TA 1.3887 1.4090 1.4057 1.3990 
(3.162) (3.243) (3.262) (3.244) 
CF 1.2369 1.2874 1.2864 1.3008 
(2.823) (2.843) (2.824) (2.943) 
CAP 1.2740 1.2951 1.2587 1.2682 
(2.989) (3.276) (3.087) (3.072) 
DIV 2.6788 2.7143 2.8724 2.8724 
(1.671) (1.748) (1.712) (1.713) 
Estimated Coefficients
Intercept -0.1579 -0.0598 -0.0536 -0.0687
TA 0.3985* 0.1996 0.2433 0.5032*
TASQR -0.0061 -0.0035 -0.0025 -0.0089
CF 0.6182* 1.1224* 0.8234* 0.6635*
CAP -0.0171 -0.3143 -0.1012 -0.1744
DIV 0.0178 -0.0250 -0.0292 -0.0410
Adjusted R² 0.8533 0.8743 0.8681 0.8712
Standard error of regression 1.1172 1.0118 1.0334 0.8878 





The numbers in parentheses are the standard deviations. 
* Significant at the 0.01 level. 
** Significant at the 0.05 level. 
*** Significant at the 0.10 level. 


TA=Total assets (net of intangible assets). 


TASQR= The square of total assets (net of intangible assets). 


CAP=Capital expenditures. 


CF=Cash flow. 


DIV=An index of product diversification. 
Table 8

Results of Testing for Structural Changes
in the R&D Model with the Wald Test a

               Two-Step Estimates                     OLS Estimates
          Total      Slopes       Intercepts     Total      Slopes     Intercepts
Years     (1)        (2)          (3)            (4)        (5)        (6)
Panel A. Capitalizing Sample (chi-squared):
1975      84.18*     83.57*       0.45           86.85*     83.39*     2.59***
1976      45.11*     44.49*       0.53           24.42*     22.53*     1.76
1977      56.48*     55.14*       0.69           56.99*     55.12*     1.53
1978      62.16*     [illegible]  0.73           64.91*     61.31*     2.91***
Panel B. Expensing Sample (chi-squared):
1975      184.64*    154.28*      0.40           162.66*    154.64*    5.91**
1976      217.12*    217.37*      0.41           225.72*    217.56*    5.40**
1977      162.97*    162.08*      0.41           167.50*    162.45*    3.66***
1978      168.35*    168.29*      -0.486         168.87**   168.68*    3.01***
a The results are corrected for heteroskedasticity with White’s correction procedure.
* Significant at the 0.01 level.
** Significant at the 0.05 level.
*** Significant at the 0.10 level.

Table 9
Average Prediction Error for Capitalizing and Expensing Samples

               Two-Step Estimates              OLS Estimates
          Capitalizing   Expensing       Capitalizing   Expensing
          Sample         Sample          Sample         Sample
Years     (1)            (2)             (3)            (4)
1975      -0.1740         0.2856         -0.020         0.3056
1976      -0.1762        -0.2668         -0.024         0.2866
1977      -0.1545         0.2443          0.083         0.2639
1978      -0.1606         0.2116         -0.012         0.2308


for structural changes in the R&D equations for both groups are also reported in table 8, 
and the average prediction errors for both groups are reported in table 9. The OLS 
results also show that both groups of firms exhibit significant structural changes in
their R&D functions. 

The OLS tests in table 9 show reductions in the R&D activities of capitalizing firms,
compared with the expensing firms. Although the magnitudes of the prediction errors
differ significantly from those provided by the switching regression, they essentially
lead to the same conclusions.
VI. Summary and Conclusions


Using the case of SFAS No. 2 as an example, this study raises concerns regarding 
self-selection bias in empirical studies that examine the economic consequences of
mandatory accounting changes. Indeed, the results of the switching regression analysis 
indicate the presence of selection bias in both capitalizing and expensing groups, 
which is also evident in systematic differences between OLS and switching regression 
results. The Wald test results show significant structural changes in the R&D model
for both groups of firms, which indicates that some of the change in the capitalizers’ 
R&D spending behavior after 1974 may be attributable to general macroeconomic 
phenomena. After controlling for the effects of economywide changes, the analysis 
shows the incremental effect of SFAS No. 2 on the R&D expenditures of former capital- 
izers. 

It should be recognized that, although the two-step estimator is consistent, it is not 
efficient. Heckman (1976) points out that efficiency can be gained by employing maxi- 
mum likelihood estimation techniques, by using the results of the consistent estimator 
as an initial value for the iterative procedures. Wales and Woodland (1980) suggest that 
the maximum likelihood estimator is consistent and asymptotically efficient. However, 
the use of this estimator in this study was not possible, as the estimation for the experimental
group did not converge.

Given the absence of an adequate theory explaining the accounting choice and, 
hence, the ad hoc nature of the accounting-choice model, the ability to make correct in- 
ferences from the switching regression results is crucially dependent on the correct 
specification of the variables included in the accounting-choice model. Consequently, 
the results of this study should be interpreted with caution. 

Additional research is needed to examine this research method in different accounting-choice
areas to assess its validity in accounting research. A possible application
of this method is to examine the economic consequences of SFAS No. 19 on oil and
gas accounting. The switching regression method could also be used to examine the 
economic motivation for making voluntary accounting changes, such as switching 
from FIFO to LIFO, switching from full-cost to successful efforts, and changing depre- 
ciation methods. 

Finally, the analysis described in this study is applicable only to cases of dichoto- 
mous choice. Lee (1983) extended the analysis of dichotomous choice to the general 
case and developed estimation techniques applicable to the case of polychotomous 
choice. 
I. Introduction


The purpose of this article is to review the methodology of models involving
qualitative and limited-dependent variables and to discuss their application in
accounting research. Some limitations of their use in existing research will
be pointed out, along with suggestions for improvement. The specific models that will
be discussed are: (1) the logit and probit models and discriminant analysis, (2) the tobit
model and the truncated regression model, (3) models involving sample selection bias,
and (4) models involving self-selectivity.
The article discusses the relationship between the logit model, the linear probability
model and discriminant analysis, problems of estimation from choice-based samples,
and some problems arising in the use of tobit models and self-selection models.


II. Logit, Probit, and Discriminant Analysis


Many studies are concerned with the determinants of choice of accounting 
methods, firm failure, bond ratings, lobbying before the FASB, auditors’ decisions, and
so on.¹ In this case, the data consist of firms belonging to two groups. The dependent


1 Noreen (1988, n. 1) lists close to 20 studies, mostly in accounting journals, covering these areas. Stone and
Rasp (1991) list, in their Appendix, 19 accounting studies using logit or probit, along with the sample sizes, the 
number of explanatory variables, and the objectives of each study. Not all studies attempting to explain ac- 
counting choice use the logit or probit models. An example in this category is Morse and Richardson (1983). 
variable is a dichotomous variable defined by:

$y_i = 1$ if the $i$th firm belongs to group 1, and
$y_i = 0$ if the $i$th firm belongs to group 2.


One can ignore the dichotomous nature of the dependent variable and run a regression
of $y_i$ on the explanatory variables $x_i$. If we write the model in the usual regression
form:


$y_i = \beta' x_i + u_i$, with $E(u_i) = 0$,


where $x_i$ is the vector of explanatory variables, we have the linear probability (LP)
model. The conditional expectation $E(y_i \mid x_i)$ is equal to $\beta' x_i$. This has to be interpreted
as the probability that the event will occur given $x_i$. The calculated value $\hat{y}_i = \hat{\beta}' x_i$ will
give us the estimated probability that the given event will occur given the particular
value of $x_i$. In practice these probabilities can lie outside the admissible range (0,1).
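
As a small illustration of the last point, the sketch below fits the LP model by OLS to six made-up observations and shows that the fitted "probabilities" fall below 0 and above 1. The data, the variable names, and the use of NumPy are assumptions made only for this illustration; they are not taken from any of the studies cited.

```python
import numpy as np

# Hypothetical toy data: one explanatory variable and a dichotomous outcome.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# LP model: OLS regression of y on a constant and x.
Z = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)

# The fitted values are the estimated probabilities of the event.
p_hat = Z @ beta_hat

print("LP coefficients:", beta_hat)
print("smallest and largest fitted value:", p_hat.min(), p_hat.max())
```

With these data the fitted value at the smallest x is negative and the fitted value at the largest x exceeds one, which is the inadmissibility problem described above.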

The LP model has several shortcomings (see Maddala 1983, 15-16). However, it is
of interest because the estimates of the regression parameters $\beta$ are proportional to the
coefficients obtained by using discriminant analysis. In discriminant analysis we try to
find a linear function of the explanatory variables that best discriminates between the
two groups. The linear function is chosen in such a way that it has the maximum between-group
variance relative to within-group variance. If $\hat{\lambda}$ are the coefficients of the
discriminant function, then we have:


$\hat{\lambda} = \frac{N-2}{RSS}\,\hat{\beta}$,


where $\hat{\beta}$ are the estimates of the coefficients $\beta$ in the LP model, RSS is the residual sum
of squares from the LP model, and N is the number of observations. Thus, once we have
the regression coefficients and the residual sum of squares from the LP model, we can
easily obtain the discriminant function coefficients. (For further discussion, see Mad-
dala 1983, 20-21.) An examination of the review by Zavgren (1983), Jones (1987), Casey
et al. (1986), and other studies on discriminant analysis in the accounting and financial
literature, shows that this point is perhaps not well known.
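
The proportionality can be checked numerically. The sketch below is an illustration under stated assumptions, not a reproduction of any cited study: it simulates two groups, computes the discriminant coefficients as the pooled within-group covariance matrix (divisor N-2) inverted and applied to the difference in group means, and compares them with the LP slope estimates scaled by (N-2)/RSS. The simulated data, the group sizes, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical samples: n1 firms in group 1 (y = 1), n2 firms in group 2 (y = 0).
n1, n2 = 40, 60
X1 = rng.normal(loc=[1.0, 0.5], scale=1.0, size=(n1, 2))
X2 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n2, 2))
X = np.vstack([X1, X2])
y = np.concatenate([np.ones(n1), np.zeros(n2)])
N = n1 + n2

# LP model: OLS of y on a constant and X; keep the slopes and the RSS.
Z = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_slopes = coef[1:]
rss = float(np.sum((y - Z @ coef) ** 2))

# Discriminant coefficients: S^{-1} (xbar_1 - xbar_2), with S the pooled
# within-group covariance matrix using divisor N - 2.
d = X1.mean(axis=0) - X2.mean(axis=0)
W = (X1 - X1.mean(axis=0)).T @ (X1 - X1.mean(axis=0)) \
    + (X2 - X2.mean(axis=0)).T @ (X2 - X2.mean(axis=0))
lam = np.linalg.solve(W / (N - 2), d)

# Discriminant coefficients recovered from the LP regression output alone.
lam_from_lp = (N - 2) / rss * beta_slopes

print(lam)
print(lam_from_lp)
```

The two printed vectors agree up to rounding error, so an OLS routine is enough to obtain the discriminant function slopes in this two-group setting.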
An alternative approach is to assume that there is an underlying latent variable $y_i^*$
and that the observed variable $y_i$ is related to $y_i^*$ through the relation:


$y_i = 1$ if $y_i^* > 0$ and $y_i = 0$ if $y_i^* \le 0$.

The regression relationship is now defined in terms of the latent variable:

$y_i^* = \beta' x_i + u_i$,


where $u_i$ are independent and identically distributed random variables with mean 0.
If $u_i$ has the normal distribution, we have the probit model. If $u_i$ has the hyperbolic
secant-squared ($\mathrm{sech}^2$) distribution, then we have the logit model. Other distributions
of $u_i$ are possible but are not often used. The likelihood function for this model is given by:


$L = \prod_{y_i = 0} F(-\beta' x_i) \prod_{y_i = 1} [1 - F(-\beta' x_i)]$,


where $F(\cdot)$ is the distribution function of $u$. For the probit model, there is no closed
form expression for $F(\cdot)$, but fast algorithms are available to evaluate it. For the logit
model, $F(-\beta' x_i)$ simplifies to $1/[1+\exp(\beta' x_i)]$, and, thus, we have a closed form
expression. The maximum likelihood (ML) estimates from the logit and probit models can
be obtained by any of the popular canned programs like SAS, TSP, RATS, and so on.
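
To make the closed form concrete, the following sketch writes out the logit log-likelihood implied by the expression above, namely the sum over observations of $y_i \beta' x_i - \log[1 + \exp(\beta' x_i)]$, and maximizes it by Newton-Raphson. It is a minimal illustration on simulated data; the function names `logit_loglik` and `fit_logit`, the true coefficients, and the sample size are all hypothetical choices, and a canned program would normally be used instead.

```python
import numpy as np

def logit_loglik(beta, X, y):
    """Logit log-likelihood: sum_i [ y_i * b'x_i - log(1 + exp(b'x_i)) ]."""
    z = X @ beta
    return float(np.sum(y * z - np.logaddexp(0.0, z)))

def fit_logit(X, y, n_iter=25):
    """Newton-Raphson iterations for the logit ML estimate, started at zero."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))      # Prob(y_i = 1 | x_i)
        grad = X.T @ (y - p)                        # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # minus the Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical simulated data: a constant and one explanatory variable.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))
y = (rng.random(n) < p_true).astype(float)

beta_ml = fit_logit(X, y)
print("ML estimates:", beta_ml)
print("log-likelihood at the estimates:", logit_loglik(beta_ml, X, y))
```

At the maximum the score vector is numerically zero, which is a convenient convergence check when experimenting with a hand-written routine like this one.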

There is usually not much to choose from between the logit and the probit models, 
except that, when it comes to the analysis with matched samples (bankrupt firms 
matched with non-bankrupt firms considered similar in all other respects), the logit 
model is more convenient. This point will be explained later when we discuss estimation
from choice-based samples. The relationship between the coefficients obtained
from the logit, probit, and LP models, and the calculation of the changes in Prob( y,=1) 
for given changes in the explanatory variables, are given in Maddala (1983, 23) and will 
not be repeated here. 

One point worth noting here concerns the terminology logit and conditional logit. 
The model we have discussed (the one used in the problems on accounting choice,
bankruptcies, and so on) is known as the logit model. This model was first discussed by
Berkson (1944), who advocated the logistic transformation, because it was easier to compute,
instead of the cumulative normal, which had been used for a number of years.
The term “conditional logit” was introduced by McFadden (1973) to denote a different
(and in some cases, a more general) model that is derived from a random utility model
based on specific attributes of the different choices. This model permits predictions of 
probabilities of choice when a new choice is introduced (as in the case of marketing 
when a new brand not analyzed by the data is introduced). It is not relevant for the 
problems in accounting that we are dealing with. 

The difference between the logit model and McFadden’s conditional logit model is 
discussed in Maddala (1983, 42-44). This distinction is important for the computer pro- 
grams used, in the analysis of sample-selection biases, and in the discussion of the rela- 
tionship between discriminant analysis and logit analysis. McFadden’s conditional 
logit needs a computer program that is more general than for the usual logit analysis. 
Though this program can also be used for the usual logit analysis, it involves a redefini- 
tion of the variables and is cumbersome. Ohlson (1980), for instance, uses the terminology
“conditional logit” and lists McFadden’s 1973 article in the references, but what he
means is the usual (Berkson’s) logit. Lo (1986) uses the MLOGIT program developed by 
Bronwyn H. Hall in his analysis of corporate bankruptcies and has to do some data 
manipulations, but the problem is one for which the usual logit program in any of the 
readily available, canned packages would do. The same is true of Palepu (1986) who, in 
his study on predicting takeover targets, uses the QUAIL program, which is designed to 
handle McFadden’s conditional logit, whereas any of the usual computer packages 
would do. 

Another terminology that occurs in the accounting literature is “multivariate prob- 
it,” introduced in Lee and Hsieh (1985) regarding choice of inventory-accounting meth- 
ods. What the authors mean is the usual probit model. The term “multivariate probit” 
often refers to cases with several latent variables, each of which is observed as a dichotomous
(or qualitative) variable, and the joint distribution is multivariate normal.


Logit Versus Discriminant Analysis 


While the logit and probit models specify the conditional distribution of y given X
(the explanatory variables), discriminant analysis begins with the conditional distribution
of X given y. Interestingly, if y is dichotomous, that is, there are two populations,
and X follows a multivariate normal distribution, the implied form for P(y|X) is the
same as that for the logit model. However, logit analysis is valid under more general
distributional assumptions about X than those implied by discriminant analysis. The
merits of using the logit model relative to discriminant analysis are discussed in Mad-
dala (1983, 21 and 81-86) and will not be repeated here.² To summarize the discussion,
if the explanatory variables are normally distributed, then one should use discriminant
analysis because it is more efficient than logit analysis in this case. However, if the ex-
planatory variables are not normally distributed, then discriminant analysis gives in-
consistent estimates, and one is better off using logit analysis. Ohlson (1980, 129) uses
the logit model and says that when he used discriminant analysis, he got worse results
in terms of prediction. This is one method of choosing between the two models. Lo
(1986) suggests a Hausman-type specification error test for choosing between the two
methods.
Let us define the two hypotheses:


$H_0$: The $X$s are multivariate normal,
$H_1$: They are not.


Let $\hat{\beta}_D$ and $\hat{\beta}_L$ be the estimates of $\beta$ obtained by using discriminant analysis and logit
analysis, respectively, and let $V_D$ and $V_L$ be their covariance matrices. $\hat{\beta}_D$ is consistent
and efficient under $H_0$, but not under $H_1$. $\hat{\beta}_L$ is consistent under both $H_0$ and $H_1$ but
inefficient under $H_0$. The Hausman statistic is:


$T = N q' [V_L - V_D]^{-1} q$, which is $\chi^2_k$,


where $q = \hat{\beta}_L - \hat{\beta}_D$, $k$ is the number of parameters in $\beta$, and $N$ is the sample size. Lo cal-
culates this statistic, finds it not significant, and, hence, argues that discriminant analy-
sis is justified. Ohlson, who was more careful with the data, came to the opposite
conclusion based on his predictive test. It would be interesting to see what the Hausman
test shows when applied to Ohlson’s data.

The calculation of the Hausman test statistic as described by Lo is complicated. A
simpler method is to note, as mentioned earlier, that the coefficients from the LP model
and from the discriminant analysis are proportional. Hence, we can implement the test
by replacing $\hat{\beta}_D$ by $\hat{\beta}_{LP}$. But $\hat{\beta}_{LP}$ and $V_{LP}$ can be obtained by just OLS. As for $\hat{\beta}_L$ and $V_L$,
they can also be obtained from the usual canned programs for the logit model. Note that
care should be taken that $\hat{\beta}_L$ and $\hat{\beta}_{LP}$ are comparable (and so also $V_L$ and $V_{LP}$). For this
purpose, we use the relationships described in Maddala (1983, 23), which is to multiply
$\hat{\beta}_{LP}$ by four and $V_{LP}$ by 16 to make them comparable to $\hat{\beta}_L$ and $V_L$.
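
The simpler route can be sketched in a few lines. The helper below is a hypothetical illustration, not Lo's procedure: it takes slope estimates and their estimated covariance matrices from a logit fit and from an OLS (LP) fit, rescales the LP quantities by 4 and 16 as described above, and forms the quadratic form in the difference. It assumes the covariance matrices passed in are the usual finite-sample estimates for the coefficient vectors themselves, so the factor N in the formula is already absorbed, and it uses a pseudo-inverse because the estimated difference of covariance matrices need not be positive definite in a given sample. The function name and the numbers in the usage lines are made up.

```python
import numpy as np

def hausman_statistic(beta_logit, cov_logit, beta_lp, cov_lp):
    """Hausman-type statistic comparing logit slopes with rescaled LP slopes.

    Assumes cov_logit and cov_lp are estimated covariance matrices of the
    coefficient estimates (not of sqrt(N) times the estimates).
    """
    b_lp = 4.0 * np.asarray(beta_lp, dtype=float)    # make LP slopes comparable
    v_lp = 16.0 * np.asarray(cov_lp, dtype=float)    # and their covariance matrix
    q = np.asarray(beta_logit, dtype=float) - b_lp
    v_diff = np.asarray(cov_logit, dtype=float) - v_lp
    # Pseudo-inverse guards against a singular estimated difference.
    return float(q @ np.linalg.pinv(v_diff) @ q)

# Hypothetical one-slope example with made-up numbers.
T = hausman_statistic(
    beta_logit=np.array([2.0]),
    cov_logit=np.array([[0.5]]),
    beta_lp=np.array([0.4]),
    cov_lp=np.array([[0.02]]),
)
print(T)
```

The resulting value would be compared with a chi-squared critical value whose degrees of freedom equal the number of coefficients included in the comparison.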


The Question of Small-Sample Biases in Tests of Significance


Noreen (1988) argues that, for the sample sizes often encountered in accounting
studies, the OLS regression (the linear probability model) seems to perform at least as
well as the probit model. However, he is concerned with only the small-sample perfor-
mance of significance tests for the estimated coefficients. The argument is that the tests


2 Martin (1977) also discusses the relationship between the logit model and discriminant analysis. Lawrence
(1983) discusses the importance of reporting delays in the statistical analysis of [illegible] by discriminant
or logit analysis.
used on the probit model reject the null hypothesis too frequently for sample sizes of
50-100. In other words, a coefficient that is considered significant at the 5 percent level 
is not really significant at that level. However, looking at the figures in his table 2 (1988, 
125), one sees that the rejection rate for the null hypothesis for the individual coeffi- 
cients is actually lower for the probit model than for the OLS. It is only in the tests of the 
overall model that the significance tests based on the probit model reject the null hy- 
pothesis too often. 

Tests of significance for the individual coefficients are only one aspect of the prob-
lem in these studies. Other aspects, like prediction and classification, are as important 
and need to be investigated before judging the small-sample behavior of probit versus 
OLS methods. As mentioned earlier, Ohlson (1980) uses the logit model and discrimi- 
nant analysis (which, as argued earlier, is the same as the LP model) and got worse re- 
sults in prediction with the discriminant model. Stone and Rasp (1991) also argue that 
experimentation with data on auditors’ Statement No. 87 consistency judgments indi- 
cates that OLS is worse than logit when it is used for prediction or classification. They 
conclude that even for sample sizes as small as 50, logit rather than OLS may be the 
preferable model for accounting-choice studies. 

That many of the test statistics suggested by the asymptotic theory do not often 
have the presumed significance levels when sample sizes are small is well known. 
These test statistics have been derived by first-order approximations, and it has been 
argued that, for small sample sizes, a “second-order” correction should be used, as sug-
gested in Rothenberg (1984). However, these corrections have been rarely used in prac- 
tice. 

Very often the issue is whether a particular explanatory variable is considered an
important determinant of the dependent variable (say the LIFO-FIFO decision). If one is
using the 5 percent significance level to judge this, it should not really affect the conclusions
if the true significance level turns out to be slightly higher. In some studies, as in
Lee and Hsieh (1985), where the issue is the comparative analysis of alternative hypoth-
eses, any biases in the significance levels should affect all the variables and, thus,
should not affect the conclusions (of course, in their study the sample size of 799 is
large). Finally, given availability of the current computer programs, it is as easy to esti-
mate the logit and probit equation as the OLS. Thus, computational considerations are
no longer a reason for choosing one or the other.


The Problem of Choice-Based Samples


One of the common issues in the analysis of discrete data is that of unequal sam-
pling rates from the two groups. This problem has been addressed by, for instance,
Palepu (1986), Dopuch et al. (1987), and McNichols and Dravid (1990).

Consider a population of $N$ firms consisting of $N_1$ in group 1 and $N_2$ in group 2 (tar-
gets vs. non-targets, adopters of LIFO vs. FIFO accounting, and so on). We consider a
sample of $n$ firms. Instead of drawing a random sample, we draw a choice-based sam-
ple, $n_1$ from group 1 and $n_2$ from group 2 ($n_1 + n_2 = n$). Typically, $n_1$ and $n_2$ are set equal.
In the case of bank failures, $N_1$, the number of failed banks, is much smaller than $N_2$,
the number of non-failed banks. So, $n_1$ is chosen to be $N_1$ (all failed banks are included
in the sample) and $n_2$ banks are chosen from $N_2$ non-failed banks.

Suppose we use this choice-based sample to estimate a logit model to study the
determinants of the relevant choice (the LIFO-FIFO decision or takeovers or bank
failures). What is wrong with this procedure? Because of the unequal sampling rates, do
we need to use a weighting procedure? The answers to these questions are that there is
nothing wrong with the logit analysis, and one does not need to use a weighting
procedure. The coefficients of the explanatory variables are not affected by the unequal
sampling rates from the two groups. It is only the constant term that is affected. This
result, first derived by Anderson (1972), is discussed in Maddala (1986 and 1983, 90-91).

Suppose the dummy variable is defined as:

$$y_i = \begin{cases} 1 & \text{if the observation belongs to group 1, and} \\ 0 & \text{otherwise.} \end{cases}$$


Let $p_1$ and $p_2$ be the proportions sampled from the two groups. Then, after estimating
the logit model from the choice-based sample, the constant term needs to be decreased
by $\log p_1 - \log p_2$. (On page 91 of Maddala [1983], the word “increased” should be
“decreased.”) The other coefficients are unaltered, and the standard errors also remain
valid, as shown in Prentice and Pyke (1979). This result cannot be derived for the probit
model or the linear probability model. This is one reason why the logit model is
preferable to either the probit model or discriminant analysis in these areas.
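
A minimal Python sketch, with hypothetical population coefficients and sampling rates chosen only for illustration, shows why only the constant term shifts. By Bayes' rule the odds within the choice-based sample equal the population odds multiplied by $p_1/p_2$, so the log-odds move by the same amount $\log p_1 - \log p_2$ at every value of $x$.

```python
import math

def logistic(t):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-t))

def log_odds(prob):
    """Log of prob / (1 - prob)."""
    return math.log(prob / (1.0 - prob))

def sample_prob(pop_prob, p1, p2):
    """Prob(y=1 | x) within a choice-based sample, by Bayes' rule.

    p1 and p2 are the sampling rates for group 1 (y=1) and group 2 (y=0).
    """
    return p1 * pop_prob / (p1 * pop_prob + p2 * (1.0 - pop_prob))

def corrected_constant(sample_constant, p1, p2):
    """Decrease the constant estimated from the sample by log p1 - log p2."""
    return sample_constant - (math.log(p1) - math.log(p2))

# Hypothetical population logit and sampling rates (illustration only).
alpha, beta = -3.0, 0.8
p1, p2 = 1.0, 0.05

# Shift in the log-odds at several values of x: identical for every x.
shifts = [
    log_odds(sample_prob(logistic(alpha + beta * x), p1, p2)) - (alpha + beta * x)
    for x in (-2.0, 0.0, 1.5, 4.0)
]

# Applying the correction to the shifted constant recovers alpha.
recovered = corrected_constant(alpha + shifts[0], p1, p2)
```

The slope `beta` never enters the shift, which is the sense in which the other coefficients are unaltered.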

Palepu (1986) talks about studies by Cosslett, Manski, and Lerman,³ and so on and
the biases that arise from the estimation of the logit model from choice-based samples.
These biases arise in the case of McFadden’s conditional logit model that includes
attributes of the choices, and where $p_1$ and $p_2$ are not known but have to be
estimated from information in auxiliary samples. Also, these studies consider the case
where the $x$’s are stochastic and thus a distribution $g(x)$ of $x$ is specified. In the
simple logit model that is used in all the accounting studies, there are no biases except
in the constant term, and this too is known in magnitude if $p_1$ and $p_2$ are known. In
fact, Palepu (1986, 21) later notes that only the constant term changes (by a known
magnitude).

Dopuch et al. (1987) consider the probit model, and here things are not as simple as
in the logit model. They talk of the Manski and Lerman WESML (weighted exogenous
sampling maximum likelihood) estimator that maximizes the weighted log-likelihood:


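$$\sum_i \left\{ \frac{Q_1}{H_1}\, y_i \log \Phi(\beta' x_i) + \frac{1-Q_1}{1-H_1}\,(1-y_i)\log\left[1-\Phi(\beta' x_i)\right] \right\},$$

where
$Q_1 = N_1/N$ = proportion in the population belonging to group 1, and
$H_1 = n_1/n$ = proportion in the sample belonging to group 1.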





3 These are reviewed in Hsieh et al. (1985).

Though this method gives consistent estimators, they are not efficient. A better
procedure, which is not more difficult computationally, is the CML (conditional maximum
likelihood) procedure suggested by Manski and McFadden. Amemiya and Vuong
(1987) show that the CML is asymptotically more efficient than the WESML. Cosslett
proposes improvements for both these estimators, but these are more difficult to
compute. In the simpler problems in the accounting literature, where the population
proportions of the two groups are assumed known, the CML estimator should be used. The
appropriate likelihood function for the CML estimator can be easily derived, and it does
not result in a simple weighting of the log-likelihoods. To see what is involved, as
before, let $p_1 = n_1/N_1$ and $p_2 = n_2/N_2$ be the sampling rates for the two groups.
Also let $p = p_2/p_1$ and $\Phi_i = \Phi(\beta' x_i)$. Then, since

$$\frac{\mathrm{Prob}(y_i=1)}{\mathrm{Prob}(y_i=0)} = \frac{p_1 \Phi_i}{p_2 (1-\Phi_i)} = \frac{\Phi_i}{p(1-\Phi_i)},$$

we get:

$$\mathrm{Prob}(y_i=1) = \Phi_i/D_i \quad \text{and} \quad \mathrm{Prob}(y_i=0) = p(1-\Phi_i)/D_i,$$

where $D_i = \Phi_i + p(1-\Phi_i)$.
The log-likelihood function is, therefore:

$$\log L = \sum_{y_i=1} \log \Phi_i + \sum_{y_i=0} \log(1-\Phi_i) + \sum_{y_i=0} \log p - \sum_i \log\left[\Phi_i + p(1-\Phi_i)\right].$$


The first two terms are the ones for the standard probit model. The third term does
not involve any parameters. The last term is the additional term arising from
choice-based sampling. Note that if $p = 1$ or $p_1 = p_2$, then this term is zero, and we
are back with the usual probit model. Maximization of this log-likelihood function is no
more complicated than the one with the WESML method considered by Dopuch et al. Also note
that we do not get a simple weighting of the respective log-likelihoods in the two
regimes. We shall see in the next section that with choice-based sampling and the tobit
model, as considered by McNichols and Dravid (1990), again a better procedure is not the
WESML, which depends on weighting the respective log-likelihoods by fixed weights,
but an analog of the CML method, which depends on weights that are functions of the
sample observations.
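
The four-term log-likelihood can be written down directly. The following Python sketch is only an illustration under stated assumptions: the data, the coefficient vector, and the function names are hypothetical, and in practice the function would be handed to a numerical optimizer rather than evaluated at a fixed $\beta$.

```python
import math

def norm_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def probit_loglik(beta, xs, ys):
    """Usual probit log-likelihood (the first two terms)."""
    total = 0.0
    for x, y in zip(xs, ys):
        phi_i = norm_cdf(sum(b * v for b, v in zip(beta, x)))
        total += math.log(phi_i) if y == 1 else math.log(1.0 - phi_i)
    return total

def cml_probit_loglik(beta, xs, ys, p):
    """CML log-likelihood under choice-based sampling, with p = p2 / p1."""
    total = probit_loglik(beta, xs, ys)
    for x, y in zip(xs, ys):
        phi_i = norm_cdf(sum(b * v for b, v in zip(beta, x)))
        if y == 0:
            total += math.log(p)                       # third term, free of beta
        total -= math.log(phi_i + p * (1.0 - phi_i))   # additional last term
    return total

# Hypothetical data: each x is (constant, regressor).
xs = [(1.0, 0.5), (1.0, -1.2), (1.0, 2.0), (1.0, 0.1)]
ys = [1, 0, 1, 0]
beta = (0.2, 0.7)

usual = probit_loglik(beta, xs, ys)
equal_rates = cml_probit_loglik(beta, xs, ys, p=1.0)     # same as usual probit
unequal_rates = cml_probit_loglik(beta, xs, ys, p=0.25)
```

Because the last term depends on $\beta$ through $\Phi_i$, the maximizing $\beta$ generally differs from the usual probit estimate whenever $p \neq 1$.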


III. The Tobit Model and the Truncated Regression Model


The tobit model is a censored regression model where observations on the dependent
variable $y_i$ are censored (or unobserved) if $y_i < c$. The explanatory variables,
however, are observed for all $i$. In the truncated regression model, by contrast, neither
the dependent nor the explanatory variables are observed if $y_i < c$. This distinction is
important in, for instance, studies on the costs of resolving bank failures.

Consider a latent variable $y_i^*$ defined as:

$$y_i^* = \beta' x_i + u_i, \qquad u_i \sim IN(0, \sigma^2).$$

The observed $y_i$ is related to $y_i^*$ by the relation:

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > c, \\ c & \text{otherwise,} \end{cases}$$

where $c$ is a known constant (usually zero). The parameters $\beta$ and $\sigma$ can be
estimated by maximizing the likelihood function:

$$L = \prod_{y_i > c} \frac{1}{\sigma}\,\phi\!\left(\frac{y_i - \beta' x_i}{\sigma}\right) \prod_{y_i = c} \Phi(z_i),$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the density function and the distribution function
of the standard normal and $z_i = (c - \beta' x_i)/\sigma$. The tobit programs to do this
are now available with all standard packages such as TSP, SAS, RATS, and so on.

For the truncated regression model, note that all the observations are drawn from
the truncated distribution $y_i > c$. Since $\mathrm{Prob}(y_i > c) = 1 - \Phi(z_i)$, we
have to divide the density function for $y_i$ by this normalizing constant (to make the
total probability equal 1). Hence, the likelihood function to maximize for maximum
likelihood estimates of $\beta$ and $\sigma$ is:

$$L = \prod_i \frac{1}{\sigma}\,\phi\!\left(\frac{y_i - \beta' x_i}{\sigma}\right) \Big/ \left[1 - \Phi(z_i)\right].$$


For the tobit model, using the OLS method with only the data on positive expenditures
results in biased estimates because $E(u_i)$ for these observations is not zero. This
expectation is given by:

$$E(u_i \mid y_i > c) = \sigma\,\frac{\phi(z_i)}{1 - \Phi(z_i)}.$$

We can think of using just the non-limit observations $y_i > c$ and estimating the
equation:

$$y_i = \beta' x_i + \sigma\,\frac{\phi(z_i)}{1 - \Phi(z_i)} + v_i$$

by OLS, since $E(u_i \mid y_i > c)$ has been included as an additional explanatory variable
and now $E(v_i) = 0$. This suggestion for correcting the bias arising from the “omitted
variable” $\phi(z_i)/[1 - \Phi(z_i)]$ is known as Heckman’s correction for selection bias
and was suggested by Heckman (1976, 1979).
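
As a rough numerical check of the omitted-variable term, the closed form $\sigma\phi(z_i)/[1-\Phi(z_i)]$ can be compared with a direct numerical integration of the truncated normal mean. The sketch below is an illustration only; the values of $z$ and $\sigma$ and the integration settings are hypothetical choices.

```python
import math

def norm_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def norm_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def truncation_bias(z, sigma):
    """E(u | y > c) = sigma * phi(z) / [1 - Phi(z)], with z = (c - beta'x) / sigma."""
    return sigma * norm_pdf(z) / (1.0 - norm_cdf(z))

def truncated_mean_numeric(z, sigma, upper=12.0, steps=20000):
    """Midpoint-rule value of E(u | u / sigma > z) for u ~ N(0, sigma^2)."""
    width = (upper - z) / steps
    numerator = denominator = 0.0
    for k in range(steps):
        t = z + (k + 0.5) * width
        weight = norm_pdf(t)
        numerator += t * weight
        denominator += weight
    return sigma * numerator / denominator

sigma = 2.0
for z in (-1.0, 0.0, 0.5):
    # The two columns should agree closely; neither is zero.
    print(z, truncation_bias(z, sigma), truncated_mean_numeric(z, sigma))
```

The term is positive for every $z$ and varies with $x_i$ through $z_i$, which is why omitting it biases OLS on the non-limit observations.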

There are three main problems with this procedure. The first is that $z_i$ is not known
and has to be estimated. This can be done by estimating the probit model based on the
indicator function:

$$I_i = 1 \ \text{if } y_i^* > c, \quad \text{and} \quad I_i = 0 \ \text{if } y_i^* \le c.$$

The second problem is that of obtaining the correct standard errors for the OLS
estimates, given that we have an estimated regressor (the omitted variable). The third
problem is that the errors $v_i$ are heteroskedastic: $\mathrm{Var}(v_i) =
\mathrm{Var}(u_i \mid y_i > c)$ is a function of $z_i$. Given all these problems and the
availability of easily computable programs for the tobit model, the above two-step
procedure is not recommended. Its purpose is in pointing out the nature of the bias that
arises from the use of OLS with censored data.

One important point to note is that, for the truncated regression model, there is no
way one can apply the Heckman correction. This is because all observations come from
the group $y_i > c$, and there is no way one can define the dummy indicator variable $I_i$
and estimate the probit model. James (1989) talks of a truncated regression model and the
Heckman correction, but obviously the model estimated was not a truncated regression
model.


Inappropriate Use of the Tobit Model 


It is tempting to use the tobit model every time one has a bunch of zero (or other
limit) observations on $y$. This is clearly inappropriate. In fact, there are many more
examples of the inappropriate use of the tobit model than of its correct use. In the tobit
model, $y^*$ can be less than $c$, but these observations with $y^* < c$ are not observed
because of censoring. The limit observations arise because of non-observability. In actual
practice these limit observations often arise more by individual choices. For instance, in
the case of automobile expenditures, the zero observations arise because some
individuals choose not to make any expenditures. It is not the case that they are negative
and we substitute zero for $y^*$ because of non-observability. Similarly, it is
inappropriate to model hours worked by the tobit model. There is no way hours worked can
be negative; the zeros are a consequence of individual choices. In this case, the proper
procedure is to model the choices. (See Maddala 1985, 4-6.)

Barth et al. (1989) estimate a tobit model with the resolution cost of the thrift
institution as the dependent variable. This is positive for those institutions that are
resolved, but since Barth et al. included healthy as well as resolved institutions in the
data set, it is meaningful to use the tobit model because we can assume negative
resolution costs for the healthy institutions even though these are not observed. Barth
et al. (1989, 27) exclude from their analysis unresolved institutions that are
GAAP-insolvent because, for these institutions, the costs are positive but not observed,
and the dependent variable is missing, not because of censoring as for the healthy
institutions. Thus, in this case, the tobit analysis is valid.

One other limitation of the tobit model as applied, for instance, in studies of
bankruptcy resolution, is that it implies that the decision of whether to pay and how
much to pay are both determined by the same latent variable $y^*$. To overcome this
limitation, Cragg (1971) suggested a model that permitted a different set of regressors to
affect the probability of a non-limit observation. This means that the probit model
determining whether to pay and the model determining the expected payout have different
parameters (with the same or different regressors). Cragg’s model treats the two
decisions as sequential. A joint decision model where the decisions of whether to pay
and how much to pay are jointly determined can be handled with the procedures
outlined in section 6.11 of Maddala (1983). Some other alternatives are discussed in
Blundell and Meghir (1987).


Choice-Based Sampling in the Tobit Model 


As in the case of the logit model, if we sample the two groups with $y_i > c$ and
$y_i \le c$ at different rates, then this should be taken into account in the estimation.
For instance, in the analysis of the determinants of the stock-split factor by McNichols
and Dravid (1990), there were 1,308 observations in the sample with stock dividends (SD)
and 810 observations in the non-SD sample. However, approximately only 10 percent of firms
listed on the CRSP Daily Returns tapes split their shares in any given year in the
1976-1983 period. Thus, the SD group was sampled at a rate approximately 15 times that of
the non-SD group. To account for this, McNichols and Dravid used the WESML method of
Manski and Lerman and weighted the log-likelihoods of the SD and non-SD samples,
respectively, by 1 and 15. However, they found that the results were insensitive to the
weighting scheme applied (0.1 for SD and 0.9 for non-SD and equal weights, which is
implied by the usual tobit model).

Again a better procedure to use is the CML method. To derive the analog of the
CML procedure of Manski and McFadden for the tobit model, consider the case where
the limit is 0, as in the case of stock splits. Let $p_1$ and $p_2$ be the proportions
sampled from the two groups: $y_i > 0$ and $y_i \le 0$. Then the densities in the two
groups should be weighted by $w_{1i}$ and $w_{2i}$, respectively, where:

$$w_{1i} = \frac{p_1}{p_1\,\mathrm{Prob}(y_i > 0) + p_2\,\mathrm{Prob}(y_i \le 0)} = \frac{p_1}{p_1 \Phi_i + p_2 (1-\Phi_i)}$$

and

$$w_{2i} = \frac{p_2}{p_1 \Phi_i + p_2 (1-\Phi_i)},$$

with $\Phi_i = \Phi(\beta' x_i/\sigma)$. Since it is just the ratio $p_2/p_1$ that
matters, let us define $p = p_2/p_1$. Then we get:

$$w_{1i} = \frac{1}{\Phi_i + p(1-\Phi_i)} \quad \text{and} \quad w_{2i} = \frac{p}{\Phi_i + p(1-\Phi_i)}.$$

The likelihood function for the tobit model with choice-based sampling will be:

$$L = \prod_{y_i > 0} w_{1i}\,\frac{1}{\sigma}\,\phi\!\left(\frac{y_i - \beta' x_i}{\sigma}\right) \prod_{y_i \le 0} w_{2i}\left[1 - \Phi_i\right].$$

The log-likelihood for this model will therefore be given by:

$$\log L = (\text{log-likelihood for the usual tobit model}) + \sum_{y_i \le 0} \log p - \sum_i \log\left[\Phi_i + p(1-\Phi_i)\right].$$

In the usual tobit model, $p = 1$ and the last term vanishes. Note that since $\Phi_i$
involves the parameters $\beta$ and $\sigma$, the ML estimates will, in general, be
different from those of the usual tobit model. Also note that it is not a case of
weighting the log-likelihoods in the two regimes with constant weights, as done in
McNichols and Dravid (1990). The weights $w_{1i}$ and $w_{2i}$ depend on
$(\beta' x_i/\sigma)$.
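
A short Python sketch may help make the observation-specific weights concrete. It is an illustration under stated assumptions: the data and parameter values are hypothetical, the limit is taken to be 0, and $p = 1/15$ is used only because the text describes the SD group as sampled at roughly 15 times the rate of the non-SD group.

```python
import math

def norm_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def norm_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def choice_based_weights(index_over_sigma, p):
    """Weights w1 (for y > 0) and w2 (for y <= 0), with p = p2 / p1."""
    big_phi = norm_cdf(index_over_sigma)          # Prob(y_i > 0)
    denom = big_phi + p * (1.0 - big_phi)
    return 1.0 / denom, p / denom

def tobit_loglik(beta, sigma, xs, ys, p=1.0):
    """Tobit log-likelihood with limit 0; p = 1 gives the usual tobit model."""
    total = 0.0
    for x, y in zip(xs, ys):
        index = sum(b * v for b, v in zip(beta, x))
        w1, w2 = choice_based_weights(index / sigma, p)
        if y > 0:
            total += math.log(w1 * norm_pdf((y - index) / sigma) / sigma)
        else:
            total += math.log(w2 * (1.0 - norm_cdf(index / sigma)))
    return total

# Hypothetical data: each x is (constant, regressor); y = 0 marks a limit observation.
xs = [(1.0, 0.4), (1.0, -0.3), (1.0, 1.1), (1.0, -1.5)]
ys = [0.9, 0.0, 2.3, 0.0]
beta, sigma = (0.1, 1.0), 1.2

usual = tobit_loglik(beta, sigma, xs, ys)               # p = 1
choice_based = tobit_loglik(beta, sigma, xs, ys, p=1.0 / 15.0)
```

The weights returned by `choice_based_weights` change with each observation's index, in contrast to the fixed weights of the WESML approach.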


IV. Self-Selection Models 


The problem of “selection bias” arises whenever there is non-random sampling.
When this non-randomness arises from individual choices, it is customary to talk of
self-selection bias; when it arises from the way the investigator designs the sample, it
is customary to talk of sample-selection bias. In spite of slight differences in the
terminology, the estimation methods are essentially the same. Selection bias refers to
the bias in the estimates obtained by following the usual procedures of estimation that
ignore the non-randomness of the samples. Though the estimation procedures are the same,
it is important to note that the conclusions one draws would be different in the two
cases.

In the following situation (the notation is as in Maddala 1983, 223-26), we have two
groups with behavioral relations:

$$y_{1i} = \beta_1' X_{1i} + u_{1i} \quad \text{and}$$
$$y_{2i} = \beta_2' X_{2i} + u_{2i}.$$

There is an indicator function:

$$I_i^* = \gamma' Z_i - u_i,$$

for which we observe a dichotomous indicator $I_i$ defined as $I_i = 1$ if $I_i^* > 0$ and
$I_i = 0$ if $I_i^* \le 0$. We observe that the $i$th individual belongs to group 1 if
$I_i = 1$ and to group 2 if $I_i = 0$. We have $n$ observations, of which $n_1$ belong to
group 1 and $n_2$ to group 2 ($n_1 + n_2 = n$).

Let us assume that the errors $(u_{1i}, u_{2i}, u_i)$ have a trivariate normal
distribution with mean vector 0 and covariance matrix:

$$\begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{1u} \\ & \sigma_{22} & \sigma_{2u} \\ & & 1 \end{pmatrix}.$$

We assume $\mathrm{Var}(u_i) = 1$ because $I_i^*$ is observed only as a dichotomous
indicator. Also, since we do not observe $y_{1i}$ and $y_{2i}$ for the same individual, we
cannot estimate $\sigma_{12}$. Consider the estimation of $\beta_1$ and $\sigma_{11}$ for
the $n_1$ observations on $y_{1i}$. We cannot use the OLS method because
$E(u_{1i}) \neq 0$ if we consider only the subset of $n_1$ observations. Specifically, we
have, using the expressions for the mean of a truncated normal,

$$E(u_{1i} \mid y_{1i} \text{ observed}) = -\sigma_{1u}\,\phi_i/\Phi_i,$$

where $\phi_i = \phi(\gamma' Z_i)$ and $\Phi_i = \Phi(\gamma' Z_i)$. Similarly,

$$E(u_{2i} \mid y_{2i} \text{ observed}) = \sigma_{2u}\,\phi_i/(1 - \Phi_i).$$

Defining $W_{1i} = \phi_i/\Phi_i$ and $W_{2i} = \phi_i/(1-\Phi_i)$, we can write the
equations as:

$$y_{1i} = \beta_1' X_{1i} - \sigma_{1u} W_{1i} + e_{1i},$$
$$y_{2i} = \beta_2' X_{2i} + \sigma_{2u} W_{2i} + e_{2i},$$

and estimate these by OLS after obtaining preliminary estimates of $W_{1i}$ and $W_{2i}$.
These estimates can be obtained by estimating the indicator function by the probit method
and getting an estimate of $\gamma$. This is the two-stage method. The selection bias in
the usual OLS arises from the omission of the variables $W_{1i}$ and $W_{2i}$. See Heckman
(1979), Lee (1978), and Maddala (1983, 223-28).⁴ If $\sigma_{1u} = 0$ or $W_{1i}$ is
uncorrelated with $X_{1i}$, there is no “selection bias” in the first equation. Similarly,
if $\sigma_{2u} = 0$ or $W_{2i}$ is uncorrelated with $X_{2i}$, there is no selection bias
in the second equation. Also, as discussed earlier in the case of the tobit model, the
two-stage method has several drawbacks. The errors $e_{1i}$ and $e_{2i}$ are
heteroskedastic. There is also a need to correct the standard errors because $W_{1i}$ and
$W_{2i}$ are “generated regressors.” The maximum likelihood (ML) procedure used by Lee and
Trost (1978) avoids these two problems. The ML estimation is not complicated because we
can write the joint density $g(u_1, u)$ in terms of $g_1(u_1)$, the marginal density of
$u_1$, and the conditional density $g_2(u \mid u_1)$. We can do this similarly for
$h(u_2, u)$. This manipulation is similar to the one done in the case of generalizations
of the tobit model discussed earlier. (See p. 176 of Maddala 1983.)
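
The second stage of the two-stage method amounts to appending one generated regressor to each group's design matrix. The sketch below assumes the first-stage probit has already been estimated; the index values $\gamma' Z_i$, the regressor rows, and the function names are hypothetical and serve only to show the construction and the signs.

```python
import math

def norm_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def norm_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def selectivity_terms(index):
    """Return (W1, W2) = (phi/Phi, phi/(1 - Phi)) evaluated at index = gamma'Z_i."""
    dens, dist = norm_pdf(index), norm_cdf(index)
    return dens / dist, dens / (1.0 - dist)

def augment(rows, indices, group):
    """Append the selectivity regressor to each row of X for one group.

    Group 1 receives -W1 (its OLS coefficient then estimates sigma_1u);
    group 2 receives +W2 (its OLS coefficient then estimates sigma_2u).
    """
    out = []
    for x, index in zip(rows, indices):
        w1, w2 = selectivity_terms(index)
        out.append(tuple(x) + ((-w1,) if group == 1 else (w2,)))
    return out

# Hypothetical group-1 regressors and fitted probit indices (illustration only).
x1 = [(1.0, 0.2), (1.0, 1.4)]
idx1 = [0.0, 0.8]
x1_aug = augment(x1, idx1, group=1)
```

Ordinary least squares on the augmented rows gives the two-stage estimates, but the reported standard errors would still need the corrections for heteroskedasticity and generated regressors noted above.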

Of greater concern is the fact that, as argued in Goldberger (1983), the selection bias
adjustments have been found to be very sensitive to the assumption of normality that
we have been making all along. The solution to this problem is to use more generalized
distributions, as discussed in Lee (1983) and Maddala (1983, 272-76), or use the
semi-parametric methods, of which the most practical appears to be the one suggested by
Gallant and Nychka (1987). But before any of these procedures can be adopted in the
accounting studies, it is best to review what has been accomplished in the literature so
far.

4 There is an error in Maddala (1983). The second equation at the top of p. 225
should have a (−) sign instead of a (+). Also, in equation 8.16, the sign before
$W_{1i}$ should be (−), not a (+).


Analysis with the Selection Model 


Suppose we have estimated the regression equations for the two groups allowing
for the possibility of selection bias. We would next want to see whether there are any
significant changes in the estimates of the effect of the explanatory variables. If so, we
would want to argue that the analysis ignoring the selection process has given
misleading results.

We would also want to see what the prediction of $y_{1i}$ would have been if it had come
from group 2 and, similarly, what the prediction of $y_{2i}$ would have been if it had
come from group 1. This is, in fact, the main purpose in the analysis of self-selection
models. In the case of the R&D example, we would want to know what the R&D expenditure of
the expensers would have been had they been capitalizers and what the R&D
expenditure of the capitalizers would have been had they been expensers. This would
then give us an estimate of the magnitude of the difference in the R&D expenditure for
all the firms due to a change in accounting between capitalizing and expensing as
mandated by SFAS No. 2. These computations have been added by Shehata (1991) since
the first draft of this article. We shall discuss these later.


A Discussion of the Empirical Results 


The empirical examples on self-selection in the accounting literature thus far do not
show any strong evidence of selection bias. What they imply is that there are significant
and persistent parameter differences between the two groups, and these need to be
studied in greater detail.

The paper by Abdel-khalik (1990b) considers the case of “good news” and “bad
news” firms. It is hard to see how this is a case of “self-selection” because firms are not
expected to select themselves to be in these groups.⁵ Not every classification into two
groups justifies the use of the self-selection model. Banker et al. (1991) consider “good
history” and “bad history” firms but do not talk of self-selection. Again, in this case,
firms do not choose membership in the respective groups. Their results show substantial
differences in the parameter estimates for the two groups compared with those in
Abdel-khalik (1990b). Conversely, in the case of LIFO versus FIFO, one can make a case
for the self-selection model because firms have a choice of belonging to either one of the
two groups.

The self-selection model is based on the idea that individuals choose one of two
groups on the basis of expected benefits from belonging to the two groups. In the
unions and wages example of Lee (1978), the benefits were measured in wages, and the
assumption is that each worker compares his or her potential wages in the two sectors
and then chooses the sector with the higher wage. That is why in Lee’s analysis the
indicator function $I^*$ has, as an explanatory variable, the wage differential
$(W_{1i} - W_{2i})$, where $W_{1i}$ and $W_{2i}$ are the wages in the union and non-union
sector, respectively, for the $i$th individual (one wage expected and unobserved, the
other actual and observed).

5 Patell and Wolfson (1982) talk of “good news” and “bad news” firms that [...]

In the examples in the accounting literature, it is hard to think of the inclusion of
the “differential benefit” term in the choice equation because the dependent variables
$y_{1i}$ and $y_{2i}$ often do not have the interpretation of benefits. This is also the
case, for instance, with the R&D example of Shehata (1991). Sometimes the benefits can be
captured by the stock price; that is why there are studies that analyze the impact of
accounting changes on stock prices, as in Lev (1979), Smith and Dyckman (1988), and
Salatka (1989). In some cases, we can identify the benefit variable. Since it is executives
who make the decision on accounting choices, one can think of measuring the benefits
of the different choices by executive pay, as in Abdel-khalik (1985).

Turning next to the empirical results in Abdel-khalik (1990b), we see that the
evidence on selection bias is very weak. Comparing the results in table 1, panel B
(p. 151), which do not include the selectivity term, with the results in table 5, panel A
(p. 165), which include the selectivity term, we note that the coefficients of the
explanatory variables are not much different from each other. Of greater significance are
the differences, both in magnitude and sign, in the coefficients for the two groups
(x>1 and x<1). Note that if there is a “selection bias,” the estimated coefficients of the
model with the selectivity term included should be significantly different from the
estimated coefficients that ignore selectivity. If there is no difference, by definition,
there is no bias.⁶

Yet another example of the self-selection model is on the jointness of audit fees and 
the demand for MAS by Abdel-khalik (1990a). The dependent variable is log of audit 
fees, and the two regimes are: 


Regime 1: Whether MAS services are provided by the incumbent audit firm. 
Regime 2: Whether MAS services are provided by others.


In this case, it seems reasonable to make the self-selection argument because it makes
sense to assume that firms choose the regime on the basis of the audit fees (they prefer
to minimize audit fees). The estimation was done by the two-stage method. However,
the sample sizes are very small (36 for regime 1, 48 for regime 2) for the application of
self-selection analysis and the selectivity bias variable is not significant. The R² values,
however, are much higher than in the example of “good news” and “bad news.”

Since the paper by Shehata (1991) is an elaborate empirical study of the
self-selection problem in the accounting literature, it is worth considering it in detail.
In the case of the SFAS No. 2 mandate that all firms should expense their R&D
expenditures, the problem is one of estimating the effect of the mandated accounting
change on R&D expenditures. Before the rule, firms had the choice of expensing or
capitalizing their R&D expenditures. Shehata considers a sample of 349 firms (121
capitalizers and 228 expensers) taken from 1973, the year preceding SFAS No. 2. He
estimates an accounting-choice equation using the probit method and then estimates by OLS
two R&D equations for the two groups with the selection variables described earlier added
as explanatory variables. The estimation method is, thus, the two-stage method described
earlier, but no allowance has been made for heteroskedasticity.
Since R&D expenditure does not have the interpretation of benefit and, hence, cannot
be the criterion on the basis of which the firms “self-select,” we have to assume that
the accounting-choice equation captures the determinants of the choice, whatever they
are, and these need not be related to R&D. In an earlier draft, Shehata introduced R&D
into the probit equation, but this raises some problems of logical inconsistency, as will
be discussed in section VI. Turning to the question of whether there is “selection bias,”
it is important to note that the estimates from OLS with and without the selectivity term
included are very close to each other, as shown in table 1. One has to conclude from
this that, by definition, there is no selection bias. Of greater significance is the fact
that the coefficients of the explanatory variables are significantly different (not only
in magnitude but in sign) between the two groups.

                               Table 1
           Comparison of Selection Model and OLS Estimates

               Capitalizing Sample          Expensing Sample
Variable       Selection       OLS          Selection       OLS
TA               1.579         1.520         0.272          0.273
(TA)²           -0.577        -0.578        -0.00253        0.00270
CF               1.838         1.857        -0.0355         0.0408
CAP             -0.760        -0.750         0.273          0.270
DIV           [illegible]     -0.259        -0.04735       -0.0451

Source: Shehata (1991), tables 4 and 5.

6 One striking result in table 5 (Abdel-khalik 1990b) that uses the two-stage method of
correcting for selectivity bias is that the Breusch-Pagan test statistic for
heteroskedasticity is highly significant. This suggests that the two-stage method is not
appropriate. However, one argument against the use of the ML method is that [...] as
argued by Lev (1989).

Once we obtain estimates of $\beta_1$ and $\beta_2$, the parameters in the R&D equations
for the two groups, taking account of the selection bias, we can predict the R&D
expenditures for all the firms from the two equations. These predicted values would give
what the R&D expenditures would have been if all the firms were expensing their R&D or if
all were capitalizing their R&D. Using the first equation and the data from the
capitalizing firms, we estimate what their R&D expenditures would have been were they to
use expensing rather than capitalizing. Note that in this procedure, we do not include any
selectivity terms. The objective in the estimation of the selection model is precisely to
obtain estimates of $\beta_1$ and $\beta_2$ free of selectivity bias. But once this
estimation is done and we have estimates of $\beta_1$ and $\beta_2$ free of selectivity
bias, these are the parameter estimates we use in any further analysis.
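
The prediction step can be sketched in a few lines of Python. Everything in the example is hypothetical: the coefficient vectors stand in for selectivity-corrected estimates, the firm data are invented, and the function names are ours. The point it illustrates is that the counterfactual prediction applies the other regime's coefficients to a firm's own regressors, with no selectivity term.

```python
def predict(beta, x):
    """Linear prediction beta'x; no selectivity term is included."""
    return sum(b * v for b, v in zip(beta, x))

def average_switch_effect(beta_own, beta_other, firms):
    """Average of (prediction under the other regime) - (prediction under own regime)."""
    diffs = [predict(beta_other, x) - predict(beta_own, x) for x in firms]
    return sum(diffs) / len(diffs)

# Hypothetical selectivity-corrected coefficients (constant, slope) for each regime.
beta_capitalizing = (1.0, 2.0)
beta_expensing = (0.5, 1.0)

# Hypothetical regressors (constant, firm characteristic) for two capitalizing firms.
capitalizers = [(1.0, 1.0), (1.0, 3.0)]

# Average change in predicted R&D had these capitalizers been expensers.
effect = average_switch_effect(beta_capitalizing, beta_expensing, capitalizers)
```

With these invented numbers the own-regime predictions are 3.0 and 7.0 and the counterfactual predictions are 1.5 and 3.5, so the average change is -2.5.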

These calculations are presented in table 6 of Shehata. Given the minor differences 
in the coefficients noted earlier, one would not expect much difference in these esti- 
mates from the OLS and the selection models. Though the magnitudes of the differ- 
ences in Shehata’s table 6 are somewhat puzzling, both the OLS and selectivity model 
point to the same conclusion: that the average R&D expenditures of capitalizers would 
have been lower if they were to be expensers, and the average R&D expenditures of the 
expensers would have been lower were they capitalizers. 


V. Structural Change and Prediction of the 
Effects of Mandated Accounting Changes 


Consider again the example of R&D expenditures under the expensing and capitalizing
methods of accounting. The analysis is similar for other problems involving
accounting choice, such as the full-cost (FC) versus successful efforts (SE) accounting
methods analyzed by, for instance, Dyckman and Smith (1979), Collins et al. (1982), Lys
(1984), and Smith and Dyckman (1988). In the example of R&D expenditure, let group 1
be the companies using the expensing method for R&D expenditures and group 2 be the
companies using the capitalizing method. After the mandated change, all companies
use the expensing method.

If tests for stability show that parameter changes have occurred only for group 2 
and not for group 1, it would indicate that only firms choosing the capitalizing method 
before the mandated change have been affected by the new law. If the tests for stability 
show that parameter changes have occurred for both groups, this is an indication of 
changes occurring with regard to R&D for all the firms. In this case, the impact of the 
mandated change on the firms in group 2 can be obtained only after deducting the 
economywide effects. A measure of the average effect of the mandated change is, there- 
fore, given by: 


$$\left(\text{average prediction error for group 2}\right) - \left(\text{average prediction error for group 1}\right).$$


These prediction errors are computed with the estimated coefficients before the
mandated change, and the data after the mandated change. Thus, the prediction errors can
be used both to test for structural change and to estimate the effect of the mandated
change on the firms that used the capitalizing accounting method before the change.

In the accounting literature, the effect of mandated accounting changes has often 
been estimated by using the changes in the rate of return on the securities of the firms. 
These studies use the capital-asset pricing model: 


$$R_{it} = a_i + b_i R_{mt} + \varepsilon_{it},$$

where:

$R_{it}$ = return on stock $i$ in period $t$, and
$R_{mt}$ = return on the market portfolio in period $t$.


This equation is estimated with weekly data prior to the issuance of the proposal
mandating the accounting change. Using the estimated values of $a_i$ and $b_i$, the
estimated returns for the post-enactment period are computed. These are compared with the
actual returns and the prediction errors computed. These prediction errors for the two
groups of firms are then compared.⁷
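
The procedure just described can be sketched as follows. This is a simplified illustration, not a reconstruction of any cited study: the return series are invented, each group is represented by a single firm for brevity, and the function names are hypothetical. The last line forms the difference in average prediction errors between the two groups.

```python
def fit_market_model(stock, market):
    """OLS estimates (a, b) of R_it = a + b * R_mt from pre-period returns."""
    n = len(stock)
    mean_s = sum(stock) / n
    mean_m = sum(market) / n
    sxx = sum((m - mean_m) ** 2 for m in market)
    sxy = sum((m - mean_m) * (s - mean_s) for s, m in zip(stock, market))
    b = sxy / sxx
    return mean_s - b * mean_m, b

def average_prediction_error(a, b, stock, market):
    """Mean of actual minus predicted returns in the post period."""
    errors = [s - (a + b * m) for s, m in zip(stock, market)]
    return sum(errors) / len(errors)

# Hypothetical market returns before and after the mandated change.
pre_market = [0.01, 0.02, -0.01, 0.03]
post_market = [0.00, 0.02, 0.01]

# Hypothetical firm from group 1: same relation before and after.
pre_g1 = [0.002 + 1.2 * m for m in pre_market]
post_g1 = [0.002 + 1.2 * m for m in post_market]

# Hypothetical firm from group 2: returns shift down by 0.02 after the change.
pre_g2 = [0.001 + 1.5 * m for m in pre_market]
post_g2 = [0.001 + 1.5 * m - 0.02 for m in post_market]

a1, b1 = fit_market_model(pre_g1, pre_market)
a2, b2 = fit_market_model(pre_g2, pre_market)

# Average prediction error for group 2 minus that for group 1.
effect = (average_prediction_error(a2, b2, post_g2, post_market)
          - average_prediction_error(a1, b1, post_g1, post_market))
```

Because the invented series contain no noise, the computed difference equals the built-in shift; with real data it would be an estimate whose significance needs to be tested.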

The problem with this methodology is that no distinction is made between the cap- 
italizing firms and expensing firms before the accounting change. The self-selection 
model captures the fact that the firms have the option of choosing one or the other— 
whichever maximizes the firm’s rate of return. Furthermore, it is important to consider 
other explanatory variables besides the market rate of return to explain the rate of re- 
turn to security i. Finally, in previous studies no statistical tests have been applied to 
check the significance of the observed effects. 

The tests for structural change by Shehata show that there have been structural
changes for firms in both the groups. The average prediction error for the capitalizing
sample is lower in magnitude but opposite in sign from that of the expensing sample.
There is, however, the problem that the number of firms considered declined over the
period 1973-1978 from 121 to 87 for the capitalizing sample, from 228 to 196 for the
expensing sample.

7 The effect on stock prices of voluntary accounting choice (as opposed to mandated
accounting choice) as in LIFO vs. FIFO has been investigated in several studies. See, for
example, Biddle and Ricks (1988), who review earlier studies. Here again the
self-selection problem is ignored.

Apart from the issue of structural changes, it is important to notice that the differ- 
ences in the coefficients between the two groups (noted in table 1) persist over the 
entire time period (1973-1978) considered, that is, even after the mandated accounting 
change. Thus, the behavior of the capitalizing sample did not converge to that of the 
expensing sample even after the mandated accounting change, as one would expect 
from the selectivity model. This suggests that something more is going on than the 
selection effects. Consider, for instance, the relationship between capital expenditures 
and R&D expenditures. The negative relationship noted for the capitalizing sample, as 
opposed to the positive relationship for the expensing sample, noted in 1973, persists 
throughout the period 1975-1978, even after the mandated accounting change. This
important result (possible industry effects) needs further investigation. Many of these
issues are complicated to analyze within the framework of a model of self-selection 
bias. Given that the selectivity effects are found to be not very important, it would be 
desirable to concentrate on these other more substantive issues. 


VI. The Issue of Simultaneity 


One issue discussed in an earlier draft of Shehata (1991) is that of simultaneity between
accounting choice (expensing vs. capitalizing) and the amount of R&D expenditures.
Although the present version does not include this analysis, I will use it as an
illustration of some of the difficulties encountered in dealing with simultaneity and
self-selection bias. How do we take simultaneity into account in the context of switching
regression models? Shehata took care of this by introducing R&D expenditure in the
probit equation, but this creates problems because the question is: Whose R&D are we
talking about? Given that R&D expenditure is an endogenous variable, we cannot estimate
the accounting-choice equation by the usual probit method because this gives inconsistent
estimates. We cannot estimate it by the reduced form either: the reduced
form is not uniquely defined because of the two different equations for the R&D
expenditures for the two groups. Thus, we have a logically inconsistent model. Issues
related to logical consistency in limited-dependent and qualitative variable models are
discussed in sections 5.7 and 7.4 of Maddala (1983) and in the references cited there.

How do we get around this problem of logical inconsistency? It would be instructive
to go back to the example studied by Lee (1978) on unions and wages. Alternative
methods of incorporating simultaneity and some pitfalls are discussed in Maddala
(1983, 348-57). What Lee did, which is better than the other procedures and does not
result in logical inconsistency, was to define two (potential) wages W_u and W_n for
union and non-union workers. They are defined, in principle, for all the workers, but
only one of them is observed based on the observed union status. Lee also introduced
the wage differential (W_u - W_n) as an explanatory variable in the choice equation for
union status. This equation is then written in the reduced form and estimated by the
probit method. Note that the reduced form is uniquely defined. Then the two wage
equations are estimated after correcting for selection bias. Finally, the predicted wage
differential (W_u - W_n) is introduced in the union-choice equation, and it is estimated
by the probit method. This is the two-stage “structural probit” estimation. Lee obtained
a significant coefficient for this variable, thus suggesting that workers make their
choice on the basis of the wage differential.
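
The structure of Lee's procedure can be written out compactly. The sketch below is only an illustration: apart from W_u and W_n, the notation (the regressors X_i and Z_i, the coefficient symbols, and the error terms) is assumed here and is not taken from Lee (1978) or Maddala (1983).

```latex
\begin{aligned}
W_{ui} &= X_i \beta_u + \varepsilon_{ui} && \text{(potential union wage)} \\
W_{ni} &= X_i \beta_n + \varepsilon_{ni} && \text{(potential non-union wage)} \\
I_i^{*} &= \delta \,(W_{ui} - W_{ni}) + Z_i \gamma - v_i && \text{(choice equation)} \\
I_i &= 1 \ \text{if } I_i^{*} > 0, \qquad I_i = 0 \ \text{otherwise.}
\end{aligned}
```

Substituting the two wage equations into the choice equation gives a single reduced-form probit in X_i and Z_i, which is why the reduced form is uniquely defined. The wage equations are then estimated with a selectivity correction, and the fitted differential replaces (W_{ui} - W_{ni}) in a second probit to obtain an estimate of \delta.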

In the case of R&D expenditures, it does not make sense to include the differential of 
R&D between those using the two accounting choices (expensing vs. capitalizing) be- 
cause it is hard to see how this differential can affect accounting choice. R&D 
expenditures are not like profits. In Lee’s example, wages measured the benefits to 
workers. Thus, the Lee-type formulation cannot be justified here. One way out of this
dilemma, which is not entirely satisfactory, is to use the two R&D expenditures as the 
explanatory variables because then the accounting-choice equation can be at least esti- 
mated in its reduced form. Though this solves the logical inconsistency problem from 
the statistical point of view, it is hard to attach a meaningful interpretation to this speci- 
fication. Thus, the best course of action in this case is not to include R&D in the probit 
equation and to argue that the simultaneity is accounted for only indirectly through the 
specification of the selection model (and given results from the selection model, this 
simultaneity is weak).

The preceding discussion is meant to point out the problems in dealing with
simultaneity in the context of switching regression models.


VII. Conclusions 


After a review of the work on limited-dependent variable models in accounting
research, some conclusions are:


1. It is important to note the relationship between discriminant analysis and
OLS (linear probability model) on the one hand, and the logit and probit models
on the other. For accounting studies, even in small samples, the available evidence
indicates that it is preferable to use the logit or probit models rather than
the OLS. Almost all computer packages offer this option now.

2. Very often the analysis has to be performed on the basis of choice-based samples
(where the groups making the different choices are sampled at different rates). In
this case it has become customary to use weighted maximum likelihood methods
(WESML). This is unnecessary in the logit model because the usual logit estimates
(except for the constant term) and their standard errors remain valid. As
for the probit and tobit models, it is argued that the CML (conditional maximum
likelihood) method, which is more efficient, should be used. This procedure is no
more difficult computationally than the WESML. The appropriate likelihood
function to maximize does not involve a simple weighting scheme but weights
that depend on the explanatory variables.

3. It is important to bear in mind the distinction between Berkson’s logit model and
McFadden’s conditional logit model. The former is simpler in structure, and this
is what is relevant in accounting studies. It is also important to note the distinction
between the censored regression model (tobit model) and the truncated regression
model. There is no way a Heckman-type correction can be used with the
latter model. Also, the tobit model should not be used if the observations on the
dependent variable are missing because of individual choices rather than pure
censoring. In this case the choice process should be modeled explicitly. This
leads to bivariate extensions of the ordinary tobit model.

4. Another model that has recently been widely used is the self-selection model.
The self-selection model implies that there are two behavioral relations for the 
two groups (LIFO vs. FIFO, expensing vs. capitalizing of R&D, etc.) and a choice 
function that determines whether an individual belongs to one group or the 
other. Suppose an entity is currently in group 1. The implicit assumption of the 
self-selection model is that its behavior would be described by the second equa- 
tion if it were to belong to group 2. Thus, after the two relations have been cor- 
rectly estimated, one can calculate what the behavior of each firm would have 
been had it belonged to the other group (e.g., what the R&D expenditure would 
have been for a firm currently capitalizing its R&D if it had expensed its R&D). 
However, for studies in accounting choice with data available for a number of
years after a mandated accounting change, an examination of the estimated
coefficients indicates that differences in the behavioral relations persist even
after firms in one group are placed in another group. Thus, an important assumption
of the selection model is perhaps not valid. This point needs further investigation.

5. The estimation methods for the self-selection models have been the two-stage
methods. These suffer from problems of heteroskedasticity and wrong standard
errors. Given the progress in computer technology during the past decade, it is
better to use the ML method. There is also considerable evidence that the
selectivity-bias estimates are very sensitive to the assumption of normality. Hence,
alternative error distributions should also be tried.

6. Of greater importance is the handling of simultaneity problems. The self-selection
model is a switching regression model. In these models, if the reduced
forms are not uniquely defined, the model becomes logically inconsistent. Simultaneity
problems have to be formulated more carefully when using limited-dependent
variable models.
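
Conclusion 2 notes that, with choice-based samples, the usual logit slope estimates remain valid and only the constant term is affected. A minimal sketch of the standard intercept adjustment follows. It assumes the population share of each group is known; the function name and the numbers are hypothetical and are not drawn from the studies reviewed here.

```python
import math

def corrected_logit_intercept(intercept_hat, sample_share, population_share):
    """Adjust a logit intercept estimated on a choice-based sample.

    sample_share: fraction of group-1 observations in the estimation sample.
    population_share: fraction of group-1 entities in the population
        (assumed to be known from outside the sample).
    The slope coefficients need no adjustment; only the constant shifts by
    the log of the ratio of sample odds to population odds.
    """
    odds_sample = sample_share / (1.0 - sample_share)
    odds_population = population_share / (1.0 - population_share)
    return intercept_hat - math.log(odds_sample / odds_population)

# Hypothetical case: a matched 50/50 sample drawn from a population in which
# 10 percent of firms belong to group 1.
adjusted = corrected_logit_intercept(intercept_hat=0.40,
                                     sample_share=0.5,
                                     population_share=0.1)
print(round(adjusted, 4))
```

When the sample and population shares coincide, the adjustment is zero and the estimated constant is returned unchanged.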


SYNOPSIS AND INTRODUCTION: The American Institute of Certified
Public Accountants (AICPA) and its affiliated state societies promote 
restrictive accountancy laws that limit both the right to express opinions on 
financial statements and the use of certain occupational titles to licensed 
public accountants. 

Although occupational licensing, like other forms of government regulation,
is justified as being in the “public interest,” critics (e.g., Stigler
1971; Peltzman 1976) suggest that licensing arises because of the professional
groups’ interest in using the coercive power of government for their
own economic advantage.

Until 1979, CPAs were content to limit state regulation to the audit
function, permitting unlicensed accountants to perform other accounting
tasks. With the growing importance of review and compilation services,
however, CPAs have sought to restrict the performance of these services
too. In addition, the AICPA and state CPA societies have used their
influence with state legislatures and licensing boards to impose limitations
on the use of professional titles such as “public accountant,” “accountant,”
and “auditor.”

Whereas some states have adopted relatively permissive licensing
laws, others restrict all analytical work and professional titles to licensees.
This study explains why some states have adopted more restrictive licensing
regimes than others. Hypotheses are developed to test the power of interest
groups, political systems, and socioeconomic variables in explaining
such differences.

The evidence, based on both univariate and multivariate techniques,
supports the following general conclusions. Restrictive licensing regimes
are more likely in states where the interest-group strength of CPAs is high,
as measured by their numbers relative to public accountants who are not
CPAs. Restrictiveness is inversely related to statewide competition between
Republicans and Democrats and is slightly related to legislative turnover.


Key Words: Licensing, Regulation of public accountancy, Professional 
regulation, Interest group politics. 


Data Availability: The data are available on request.


I. Background 


THE stated purposes of accounting licensure are to promote greater practitioner
competence and to protect the public.¹ The AICPA and its affiliated state societies
promote accountancy laws that limit both the use of certain occupational
titles and the right to express opinions on financial statements to licensed public
accountants. The expansion of practice restrictions to include compilations and
reviews is of particular relevance to this study.

Before 1979, public accountants could perform an audit and express an appropriate
opinion or refrain from auditing and disclaim an opinion.² Recognizing that
the rising costs of auditing had created a need for lower cost alternatives, a committee 
of the AICPA issued the “Statement on Standards for Accounting and Review Services 
(SSARS) No. 1.” The new standard, which took effect in 1979, added two types of ser- 
vices—compilation and review. 

CPAs historically limited state regulation to the audit function, permitting non- 
CPAs to perform other accounting tasks. With the growing importance of review and 
compilation services, however, CPAs have sought to restrict the performance of these 
services. The AICPA has argued that, even for compilations, financial-statement 
readers may draw inferences regarding the accuracy of the statements because of the 
independent accountant’s association. 

The AICPA and the state CPA societies also have sought to restrict the use of the 
terms “public accountant,” “accountant,” and “auditor.” The Institute has argued that 
use of any of the aforementioned titles, even the generic term “accountant,” implies a 
level of expertise that the practitioner may not have (Young 1986). 


II. Hypotheses Development 


The diversity of licensing schemes within the United States suggests that CPAs desir- 
ing restrictive licensing have been more successful in promoting their regulatory agenda 
in some states than in others. There is a need to know why these CPAs are more effec- 
tively represented in the political process than their counterparts in other states. Under 
what conditions are CPAs likely to succeed (or fail) in obtaining favorable legislation? 


1 The American Institute of Certified Public Accountants (1981, 4) explains the public role of the profession 
in this manner: “Financial statements that have been examined by CPAs are used by the public to help make
critical decisions. Users of financial statements... rely upon the competence of professional accountants. 
People who invest money or grant credit on the basis of financial statements and related reports are entitled to 
assume that those accountants who are CPAs as a result of having demonstrated their competence under state 
law are differentiated from those who do not have such competence. The public cannot reasonably be expected 
to investigate the underlying qualifications of each accountant.” 

2 The formal rules governing unaudited financial statements were expressed in the AICPA’s Statement on 
Auditing Procedure (SAP) No. 23 (promulgated in 1949) and SAP No. 33 (promulgated in 1963). 


Young—Interest Group Politics 811 


Research related to other professions has identified three categories of explanatory 
variables that account for the differences among states in the degree to which profes- 
sions are regulated. These categories are: (1) the relative strength of interest groups con- 
cerned with the regulations; (2) characteristics of a state’s political system; and (3) a 
state’s socioeconomic environment (see, e.g., Begun et al. 1981; White 1982; Paul 1982). 
Testable hypotheses for each set of explanatory variables are developed below. 


Interest-Group Variables 


Research (e.g., White 1980) suggests that the size of a licensure-seeking group rela- 
tive to that of opposing groups significantly affects occupational regulation. Size alone, 
however, appears to have little explanatory power.³ This study develops a proxy for the
strength of CPAs relative to their principal legislative rivals—non-certified public ac- 
countants (NPAs). Interest-group strength is measured here as the ratio of the number 
of CPAs to the number of NPAs in a given state. Because NPAs are most directly af- 
fected by regulations favoring CPAs, they are likely to constitute the principal opposi- 
tion to any regulatory benefits sought by CPAs. 

Ideally, a measure of interest-group strength would match only those CPAs who
stand to benefit from expanded regulation against those NPAs who stand to lose.
Although one may reasonably argue that the overwhelming majority of NPAs have a
substantial stake in the outcome of these regulatory battles, the same cannot be said for
CPAs. Young (1986) found that NPAs did not compete in the market for audit services,
even in those states where they were permitted to do so. Given the nature of the services
they perform, the NPAs’ principal competitors are practitioners in local firms to whom
rents would accrue from licensing restrictions. In addition to the CPAs employed in
private industry, government, or academia, practitioners working in large, national
CPA firms do not stand to benefit from these regulatory restrictions. Hence, a substantial
portion of CPAs is not expected to have an economic stake in the outcome of the
licensing controversy.⁴

Although a high CPA/NPA ratio suggests relative ease in obtaining favorable regulation,
it may also suggest a lesser incentive for CPAs to obtain such regulation. In the
limit (that is, in states with no NPAs), the absence of NPAs precludes the possibility of
an immediate competitive threat, and CPAs’ demands for broader practice and title
restrictions would disappear. However, differences among states might encourage
those states with relatively high CPA/NPA ratios to press for protective statutes as a
preemptive strike against NPAs who might consider migrating from states with more
restrictive licensing regimes. As a result, one would still expect states with high
CPA/NPA ratios to be more restrictive than other states.

The first hypothesis can then be stated as follows: 


3 In White’s (1980) study of registered nurses (RNs), for example, a size variable (female RNs per 100,000
population) was not significant in explaining why some states adopted restrictive licensing laws before
others. However, when White used a relative size variable (the percentage of nursing personnel in general
hospitals who were RNs), the coefficient was significant, with the predicted sign, at the 0.05 level.

4 A more precise measure of interest-group strength, rather than a state’s entire CPA population, would be
limited to CPAs who practice in small firms and, therefore, are in direct competition with NPAs. However, as
few state CPA societies classify their membership according to type of practice (national, regional, etc.), such
refined data are unavailable. Moreover, there is no empirical evidence or a priori reasoning to suspect a
systematic bias across states regarding the proportion of CPAs with sizable individual stakes in the controversy.
For this reason, the measure described earlier is used as the proxy for interest-group strength.


812 The Accounting Review, October 1991


H1: Public accounting regulations are more likely to be restrictive in states with 
large numbers of CPAs relative to non-CPAs. 


For interest-group strength to translate into political influence, some means must 
be found to facilitate interest-group involvement in the political process. Previous re- 
search has identified professional associations as the principal mechanism for channel- 
ling interest group demands on the political process (see, e.g., Akers 1968, Begun et al. 
1981). In public accountancy, state CPA societies fulfill this function. 

Consequently, one measure of interest-group political strength is the percentage of 
CPAs belonging to state societies. This measure may also serve as a proxy for the 
degree of a state society’s aggressiveness in seeking favorable regulation. Thus, the sec- 
ond hypothesis may be stated as follows: 


H2: Public accounting regulations are more likely to be restrictive in states where a 
high percentage of CPAs belong to the state CPA society.


Political-System Variables


When SSARS No. 1 was adopted by the AICPA, state CPA societies throughout the 
United States sought to expand the scope of regulated practice to include compilations 
and reviews. In states imposing such restrictions, state accountancy boards have either 
enacted regulations restricting all analytical work to licensees, or reinterpreted earlier 
licensing laws to include compilations and reviews. In most states, however, the new 
practice restrictions were imposed by new, post-SSARS statutes, which raises the possi- 
bility that the nature of a state’s political system influences the ability of professional 
groups to obtain favorable regulation. Even if the benefits of licensing restrictions were 
uniform across the states, characteristics of the states’ political systems could enable 
CPA societies to influence the political process at lower cost in some states than in 
others. This possibility is explored using two political-system variables. 

Interparty competition (IC) indicates the relative balance between Democratic and 
Republican control of the state legislature. IC is measured as the ratio of the number of 
offices held by the dominant party to the number held by the other. A value of “1” rep- 
resents perfectly balanced competition since the number of Democrats would equal the 
number of Republicans. The political science literature (see, e.g., Dawson and Robin- 
son 1963) suggests that, as IC approaches unity, the competing parties seek to expand 
their share of votes by broadening the diversity of interests to which they appeal. Since 
greater competition expands the scope of conflict within the political system, interest- 
group power would be checked by the need of the political parties to broaden popular 
support for their candidates and policies. Accordingly, IC is expected to be inversely 
related to the degree of occupational regulation, which leads to the following hy- 
pothesis: 


H3: Public accounting regulations are more likely to be restrictive in states with 
low interparty competition. 


The turnover rate of state legislators (TO), the second political-system measure, is 
used because legislative turnover affects the way a state legislature conducts its busi- 
ness (e.g., development of a seniority system, committee structure, committee assign- 
ments, and legislative autonomy relative to other branches of government). Further- 
more, legislative careerism (as reflected by career-oriented politicians) is related to the 
growth of expertise on policy issues (Polsby 1968). Although one may reasonably expect
turnover to be high in states with high interparty competition (that is, in states where 
political conflict is high), legislative turnover is also influenced by other factors such as 
legislators’ compensation, the power of the legislature relative to other agencies of 
state, and the reputational benefits of holding state office. A value of “1” for this
variable represents complete turnover in the state legislature, and a value of “0” represents
no turnover (i.e., all incumbents were re-elected).

Begun et al. (1981, 240) explain: “As turnover increases, the regulation-seeking oc- 
cupation should obtain more of what it wants from a legislature ill-equipped to chal- 
lenge its arguments on substantive grounds.” Turnover should have a negative influ- 
ence on the “professionalism” of a state legislature and should, therefore, improve a 
profession’s success in acquiring regulation. Thus a fourth hypothesis is: 


H4: Public accounting regulations are more likely to be restrictive in states with a 
high turnover of state legislators. 


The Socioeconomic Environment 


The justification for licensure comes from the notion that occupational service mar- 
kets are characterized by information asymmetry. Presumably, without a regulatory 
apparatus to protect them, consumers might fall prey to frauds and incompetents. Al- 
though unregulated markets may generate information that differentiates service pro- 
viders along the quality dimension, and thereby mitigate the asymmetry problem, in- 
formation search can be costly for consumers. 

Licensing reduces information-search costs for those consumers who demand a 
certain level of service quality. By imposing high entry standards, licensing laws reduce 
the supply of professional services, thereby causing the market to clear at a higher 
price. In effect, the cost of higher standards would be distributed throughout the state 
in the form of higher prices. The affluent consumers who can afford these prices would 
be better off because the standards provide them with a lower bound for the quality of 
services they purchase. 

If the demand for quality is, at least partly, a function of the consumer’s ability to 
pay for it, one would expect to observe more restrictive licensing laws in states with rel- 
atively high incomes (Leffler 1978). Therefore, one can hypothesize that: 


H5: Public accounting regulations are more likely to be restrictive in states with 
relatively high per capita income. 


III. The Data


The political controversy in accounting licensure centers on (1) the extension of 
practice regulations to reviews and compilations, and (2) occupational title restrictions. 
In 27 states, CPAs have succeeded in excluding unlicensed accountants from perform- 
ing either reviews or compilations; 29 states restrict the use of the term “accountant” to 
those public accountants who are licensed by the state. The intersection of these two 
groups is composed of 22 states. These 22 states are defined as “restrictive”; the other 
28 states as “partially or not restrictive.” (In light of the imprecise nature of the term 


5 Restrictiveness of a state’s licensing regime is defined as of December 1985.
“restrictive,” alternative definitions are considered later.) The data indicate no apparent
regional or population bias; that is, states in one region or with a certain population
level are no more likely to be restrictive than other states.
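
The state counts above can be checked with simple inclusion-exclusion arithmetic. The sketch below uses only the counts reported in the text; the 50-state total is implied by 22 + 28, and the function and variable names are illustrative.

```python
# Counts reported in the text; the 50-state total is implied by 22 + 28.
TOTAL_STATES = 50
PRACTICE_RESTRICTED = 27  # exclude unlicensed accountants from reviews or compilations
TITLE_RESTRICTED = 29     # reserve the term "accountant" for licensees
BOTH = 22                 # intersection of the two groups

def classify(practice_restricted: bool, title_restricted: bool) -> str:
    """Label a state using the study's primary definition of restrictiveness."""
    if practice_restricted and title_restricted:
        return "restrictive"
    return "partially or not restrictive"

# Inclusion-exclusion: states with at least one of the two restrictions.
at_least_one = PRACTICE_RESTRICTED + TITLE_RESTRICTED - BOTH
only_one = at_least_one - BOTH
neither = TOTAL_STATES - at_least_one
partially_or_not = only_one + neither

print(at_least_one, only_one, neither, partially_or_not)  # prints 34 12 16 28
```

Under these counts, the 28 "partially or not restrictive" states split into 12 with exactly one type of restriction and 16 with neither.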

The study sample consists of the 41 states for which data were available for all the
explanatory variables described in the preceding section. Of these 41 states, 20 are
restrictive and 21 are partially or not restrictive. Information on the number of CPAs in
each state was obtained for 1985 from either the National Association of State Boards of
Accountancy or the individual state board. A count of CPAs was unavailable for six
states.⁶ The number of NPAs in each state was obtained from the 1979 Yearbook of the
NSPA.

Given the predictive nature of the model, a more appropriate measure for CPAs
than the number in 1985 would be the number in each state when SSARS No. 1 was
adopted (1979), but data for that year are unavailable for most states.⁷ The percentage of
CPAs belonging to their state societies was formed by dividing the number belonging to
the state CPA society by the number of CPAs in the state. Data on state society
membership were compiled by the AICPA in a report dated July 1985. Again, measures
based on the 1985 data were used to proxy for 1979 data.

Data for the political-system variables (turnover and interparty competition) were
taken from the 1980-1981 and 1982-1983 editions of the Book of the States, published by
the Council of State Governments. To account for the possibility that turnover is
influenced by whether or not an election for state office occurs at the same time as a
presidential election, the turnover measure is based on both the 1978 and 1980 elections.
Similarly, the interparty competition measure is based on the composition of the state
legislatures, by political party, as of January 1979 and January 1981. Turnover data
were unavailable for four states.⁸ IC data were unavailable for Nebraska because its
state legislature is officially nonpartisan.

Finally, the income variable was based on per capita personal income for each state
in 1979 as it appears in the Survey of Current Business, published by the Bureau of
Economic Analysis, U.S. Department of Commerce.
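
As a rough illustration of how the interest-group and turnover variables described in this section might be assembled for one state, consider the sketch below. All figures are hypothetical. Averaging the 1978 and 1980 turnover rates is an assumption, since the text says only that the measure is based on both elections, and measuring turnover as the share of seats won by newcomers is likewise assumed.

```python
def cpa_npa_ratio(n_cpas: int, n_npas: int) -> float:
    """Interest-group strength: CPAs per non-certified public accountant."""
    return n_cpas / n_npas

def society_share(n_society_members: int, n_cpas: int) -> float:
    """Fraction of a state's CPAs who belong to the state CPA society."""
    return n_society_members / n_cpas

def turnover_rate(seats: int, newcomers: int) -> float:
    """Share of seats won by newcomers: 1 is complete turnover, 0 is none."""
    return newcomers / seats

# Hypothetical state (none of these figures come from the study's data).
n_cpas, n_npas, n_members = 5000, 250, 3250
ratio = cpa_npa_ratio(n_cpas, n_npas)
ssoc = society_share(n_members, n_cpas)

# Assumed aggregation: simple average of the 1978 and 1980 election rates.
tov = (turnover_rate(120, 30) + turnover_rate(120, 27)) / 2

print(ratio, round(ssoc, 3), round(tov, 4))
```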


IV. Results


Univariate Tests


The sample falls into two distinct sets: 20 states with fully restrictive licensing
systems and 21 states with less restrictive or permissive systems. Table 1 provides
summary statistics for each sample. Comparisons between the restrictive and nonrestrictive
states indicate that the former have significantly higher ratios of CPAs to NPAs.
There is practically no difference between the two samples in terms of the percentage
of CPAs belonging to the state CPA society. The explanatory power of the political-
system variables is limited by the ability of some state CPA societies to obtain favorable
regulation through accountancy board rulings, which allows them to bypass the state


6 These are: Connecticut, Maryland, Minnesota, Nebraska, Rhode Island, and Tennessee.

7 An implicit assumption in a CPA/NPA variable using 1985 data for CPAs and 1979 data for NPAs is that
the number of CPAs grew at the same rate for each state from 1979 to 1985. To the extent that economic growth
rates differed across the states over that period, one may also observe different growth rates in the number of
CPAs, thus creating an errors-in-variables problem.

8 These include: Alabama, Louisiana, EE and Mississippi.

Table 1
Profile Analysis of Sample States, Mean (Standard Deviation)

                                                      Probabilities*
Variable     Restrictive   Non-Restrictive   Mann-Whitney   Student’s t
CPA/NPA        27.16           17.51            0.001          0.001
              (11.19)          (6.05)
%SSOC          0.650           0.653            0.525          0.532
              (0.124)         (0.142)
IC             0.453           0.647            0.007          0.008
              (0.242)         (0.228)
TOV            0.255           0.247            0.266          0.321
              (0.061)         (0.053)
INCOME         8.311           8.663            0.828          0.854
              (1.087)         (1.019)
n                 20              21

* One-tailed test.


legislature. Nevertheless, interparty competition differs significantly between the two
samples.⁹ Consistent with expectations, the average turnover ratio is higher for restrictive
states than for the nonrestrictive sample, but the difference is not significant.
Although the direction of the difference in per capita income is contrary to expectations,
the difference is also not significant.¹⁰ ¹¹
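
The Student's t comparisons can be approximated from the summary statistics in Table 1 alone. The sketch below assumes an equal-variance (pooled) two-sample t-statistic with 20 restrictive and 21 nonrestrictive states; the text does not state which variant was used, so the values are a consistency check, not a reproduction of the reported tests.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t-statistic with a pooled (equal-variance) standard error."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error

# (restrictive mean, restrictive sd, nonrestrictive mean, nonrestrictive sd)
table1 = {
    "CPA/NPA": (27.16, 11.19, 17.51, 6.05),
    "%SSOC": (0.650, 0.124, 0.653, 0.142),
    "IC": (0.453, 0.242, 0.647, 0.228),
    "TOV": (0.255, 0.061, 0.247, 0.053),
    "INCOME": (8.311, 1.087, 8.663, 1.019),
}

t_stats = {
    name: pooled_t(m1, s1, 20, m2, s2, 21)
    for name, (m1, s1, m2, s2) in table1.items()
}

for name, t in t_stats.items():
    print(f"{name:8s} t = {t:6.3f}")
```

Under this assumption the CPA/NPA and IC differences produce t-statistics well above 2 in absolute value, while those for %SSOC, TOV, and INCOME are small, which matches the pattern of probabilities in Table 1.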


Multivariate Tests 


The impact of each of the explanatory variables was assessed using probit, where the
dependent variable is a 0-1 dummy variable representing the restrictive status of a
state’s licensing regime (0 = nonrestrictive, 1 = restrictive). Table 2 reports the results of
four regressions, one for each of four alternate definitions of “restrictiveness.” Model 1
represents the definition discussed earlier, where the restrictiveness designation is 


9 The mean value of the IC variable for restrictive states that adopted practice restrictions through
legislation is smaller than for the restrictive states that adopted restrictions through other means. Moreover, the
average TO ratio is higher for the former group. These data indicate that the tests reported here are biased
against rejecting the null hypotheses (of no relationship between restrictiveness and the IC or TO variables).

10 Correlation coefficients between the independent variables indicate the absence of multicollinearity. 
The only significant coefficient is between IC and INCOME (p<0.05). This correlation arises from the ten- 
dency of southern states, which have relatively low per capita incomes, to exhibit low interparty competition. 
Low competition results from the decades-old dominance of Democrats in that region of the country. 

11 The absence of nine states from the final sample raises the possibility that the reported results would
differ if data on all independent variables were available for all states. Accordingly, the univariate tests were 
reexamined using all available data. (The tests used data from 44 states for CPA/NPA and %SSOC, 46 states for 
TOV, 48 states for IC, and 50 states for INCOME.) With the exception of the interparty competition variable, the 
results are substantially the same as those reported in table 1. Several nonrestrictive southern states were de- 
leted from the final sample; because southern states have relatively little interparty competition, their exclu- 
sion overstates the power of the IC variable. Nevertheless, when univariate tests for IC were run for 49 states 
(every state but Nebraska), the difference in means between restrictive and nonrestrictive states was still sig- 
nificant: at the 0.0606 level using the Mann-Whitney test statistic, and at the 0.0654 level using a paired t-test. 
Table 2

Probit Analysis of the Relation Between Restrictiveness of Regulation and Interest-Group,
Political-System, and Socioeconomic Variables (n=41)

                                        Coefficients (t-statistics)
Explanatory      Predicted
Variables        Sign        Model 1      Model 2      Model 3      Model 4ᵈ
Constant                     -2.722       -3.651       -0.784       -1.366
                             (-0.984)     (-1.316)     (-0.288)     (-0.512)
CPA/NPA          +           0.108        0.110        0.075        0.088
                             (2.891)ᵃ     (2.924)ᵃ     (2.125)ᵇ     (2.536)ᵃ
%SSOC            +           0.247        0.371        1.017        1.177
                             (0.126)      (0.189)      (0.533)      (0.630)
IC               -           -2.467       -2.783       -2.911       -2.180
                             (-2.065)ᵇ    (-2.289)ᵇ    (-2.519)ᵃ    (-1.968)ᵇ
TOV              +           7.562        8.623        5.373        3.234
                             (1.614)ᶜ     (1.826)ᵇ     (1.199)      (0.741)
INCOME           +           -0.00004     0.00006      -0.00009     -0.00007
Pseudo R²                    0.441        0.468        0.342        0.287
χ²                           26.26        25.42        19.55        19.25
Percent correctly
classified                   75.61        80.48        76.61        73.17


ᵃ Significant at the 0.01 level (one-tailed).
ᵇ Significant at the 0.05 level (one-tailed).
ᶜ Significant at the 0.10 level (one-tailed).
ᵈ Definitions of Restrictiveness:
Model 1: Complete practice restrictions + title restrictions (TR) 
Model 2: Restrictions on audit and review+TR 
Model 3: Restrictions on audit+TR 
Model 4: TR only 


limited to states with both title restrictions and practice restrictions on all forms of analytical
work. Model 2 adjusts the definition to include states that are restrictive in every
sense except that they permit NPAs to perform compilations. Model 3 expands the definition
still further by including states that also permit NPAs to perform reviews. Model
4 ignores practice restrictions altogether and defines restrictiveness solely in terms of
title restrictions.
The maximum likelihood estimates of the probit regressions indicate that all the
signs are in the predicted direction, with the exception of the INCOME variable in
three of the four models.12 The coefficients for the CPA/NPA and IC variables are significant
at conventional levels; the TOV variable is significant for Models 1 and 2. The
%SSOC coefficient has the predicted sign for all four models but is not significant. The
insignificance of the INCOME variable is not surprising in light of the anecdotal evidence,
made available to the author, indicating limited involvement by accounting service
users in the licensing controversy.


12 Based on conventional regression diagnostics, the residuals for each of the four regressions are shown to
be approximately normally distributed, uncorrelated with the independent variables or with each other, and
not significantly heteroskedastic.

As a further test of predictive ability, the above procedure was repeated 41 times, 
for each model, by deleting a sample state each time. The model estimates based on the 
40 included states were then used to predict whether the deleted state was restrictive or 
nonrestrictive, thereby creating an out-of-sample procedure. Twenty-nine states (70.73 
percent) were predicted correctly for each of the first three models, and 26 states (63.41 
percent) for Model 4. Thus, the predictive ability using this procedure is slightly less 
than that reported for the estimated model in table 2. 
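
The hold-one-out procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the paper re-estimates a probit model on the 40 included states, whereas the `fit_threshold` and `predict_threshold` functions here are hypothetical stand-ins (a midpoint cutoff on a single variable) so that the example stays self-contained. The `percent_correct` helper simply reproduces the reported percentages from the counts in the text.

```python
from typing import Callable, Sequence, Tuple

Row = Tuple[float, ...]


def leave_one_out_accuracy(
    rows: Sequence[Row],
    labels: Sequence[int],
    fit: Callable[[Sequence[Row], Sequence[int]], object],
    predict: Callable[[object, Row], int],
) -> float:
    """Share of observations predicted correctly when each is held out in turn."""
    correct = 0
    for i in range(len(rows)):
        # Estimate the model on every observation except observation i.
        train_rows = [r for j, r in enumerate(rows) if j != i]
        train_labels = [y for j, y in enumerate(labels) if j != i]
        model = fit(train_rows, train_labels)
        # Predict the held-out observation and compare with its actual label.
        if predict(model, rows[i]) == labels[i]:
            correct += 1
    return correct / len(rows)


# Hypothetical stand-in for the probit model: a midpoint threshold on one variable.
def fit_threshold(rows, labels):
    ones = [r[0] for r, y in zip(rows, labels) if y == 1]
    zeros = [r[0] for r, y in zip(rows, labels) if y == 0]
    return (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2


def predict_threshold(threshold, row):
    return 1 if row[0] > threshold else 0


def percent_correct(n_correct: int, n_total: int) -> float:
    """Percentage of states classified correctly, rounded to two decimals."""
    return round(100 * n_correct / n_total, 2)
```

With 41 states, `percent_correct(29, 41)` and `percent_correct(26, 41)` give the 70.73 and 63.41 percent figures quoted in the text.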


V. Concluding Comments 


This paper argues that restrictive licensing regimes in public accountancy have 
evolved as a function of interest-group strength, political-system variables, and socio- 
economic characteristics. The evidence supports the following general conclusions. 
Restrictive licensing regimes, with restrictiveness defined in a number of ways, are 
more likely in states where the interest-group strength of CPAs is high (as measured by 
their numbers relative to non-CPAs). Restrictiveness is also inversely related to state- 
wide interparty competition between Democrats and Republicans, and is slightly re- 
lated to legislative turnover. Finally, broad socioeconomic variables, such as personal 
income, are not found to be important determinants of restrictiveness. 
most instances, choices among accounting practices have no direct cash
flow consequences, but changes in R&D spending to satisfy current-period
income objectives do alter cash flows. The evidence is consistent with
assertions that U.S. manufacturing firms are not competitive internationally
in part because U.S. managers are overly concerned about how their R&D
decisions affect current-period earnings (see, e.g., Markoff 1990).

We analyze capital spending to determine whether these R&D
reductions reflect differences in investment opportunities or incentives
rather than differences in accounting treatment. Unlike the results for
R&D, differences in capitalized costs among firms in the sample are not
statistically significant.

We also consider whether the results can be attributed to accounting-based
compensation arrangements. For this purpose, we examine a sample
of firms with no accounting-based management compensation contracts in
effect during the test period. Results are comparable to those obtained for
the entire sample, which is inconsistent with an explanation that the
observed differences in R&D spending are attributable only to explicit
compensation arrangements.

Our results are consistent with conclusions that compliance with
SFAS No. 2 (FASB 1974) discouraged investment in R&D (Elliott et al.
1984; Horwitz and Kolodny 1980). That is, the evidence suggests that
managers are more likely to consider current-period income effects when
making R&D decisions than when making capital-spending decisions,
whose costs are amortized over a number of accounting periods.


Key Words: Research and EE ngome eg investment 
decisions.


Data Availability: 1988 COMPUSTAT data file, 1984 proxy statements on
file with the Securities and Exchange Commission.


I. R&D as a Discretionary Investment


CONSIDER a manager who chooses among R&D investment opportunities and
who benefits (e.g., receives a bonus) when the firm’s reported earnings exceed a
known income objective.2 Assume that the manager knows the other components
of pre-tax earnings before making the investment decision. In conformance with SFAS
No. 2, the amount invested in R&D is to be charged to the current period’s income.
Table 1 shows three mutually exclusive cases that are distinguished by how the investment
decision affects the ability to report earnings greater than the income objective.
For case 1, the difference between the anticipated earnings before R&D and the
target income exceeds the cost of acceptable R&D investment opportunities. Thus, a
decision to accept all available opportunities does not compromise the manager’s ability
to achieve the income objective. For case 3, the anticipated earnings before R&D are


2 Benefits to the manager can be actual or perceived, pecuniary or nonpecuniary, and, although they are
presumed to depend on current accounting income, they may be realized in future periods.
Table 1

Effect on R&D when Concern about Reporting Income that Exceeds
Income Objectives Influences Investment Decisions


Case 1

Conditions: Accepting or rejecting existing R&D opportunities
yields current-period income that exceeds the
income objective

Predicted Behavior: Management accepts positive net present-value
opportunities since doing so does not compromise
the income objective


Case 2

Conditions: Accepting all existing R&D opportunities yields
current-period income less than the income
objective, but rejecting all opportunities yields
income greater than the objective

Predicted Behavior: Management rejects some positive net present-value
opportunities since accepting all proposals
compromises the income objective


Case 3

Conditions: Accepting or rejecting existing R&D opportunities
yields current-period income less than the income
objective

Predicted Behavior: Management accepts positive net present-value
opportunities since income is less than the
objective irrespective of the investment decision


less than the income objective. Again, funding all R&D investments does not affect the 
ability to achieve the income objective because reported earnings are lower than the 
target irrespective of the investment decision. 

In contrast, for case 2, the ability to report earnings greater than the income objective
turns on the R&D investment decision. Here, anticipated earnings before R&D
exceed target income by an amount smaller than is needed to fund all investment 
opportunities. Thus, if managers benefit from reporting earnings greater than the 
income objective, we would expect lower R&D spending for case 2 than for either case 
1 or case 3. 

The empirical tests that follow investigate whether relative R&D investment is less 
for case 2 than for the case 1 or case 3 scenarios. 


II. Empirical Constructs and Data 


Identifying case 2 observations requires knowing the firm’s investment opportunity 
sets and income objectives, both of which are unobservable. In this study, we use the 
prior year’s R&D to approximate the current year’s investment opportunities.3 We also
presume that managers seek to report earnings increases and avoid reporting losses. 
Because of the difficulties in specifying effective tax rates and the tax consequences of 
R&D investment, the tests focus on pre-tax income, although the after-tax income may 
be of greater concern to firm managers.4

3 Tests using mean R&D expense for the prior three years to approximate available opportunities are similar;
therefore, we report results only for analyses that use the prior year’s R&D expense to approximate current
investment opportunities.

4 To the extent that concern about income derives from compensation, pre-tax income may be the more appropriate
measure for the analysis. Specifically, Healy (1985) notes that accounting-based management compensation
tends to be based on pre-tax, rather than after-tax, income.
To illustrate how we classify the observations, suppose the objective is to report
positive earnings. When income before tax and R&D exceeds the prior year’s R&D 
expense, the observations are classified under case 1. Observations are assigned to case 
2 if income before tax and R&D is positive but less than the prior year’s R&D expense. 
Finally, case 3 consists of observations with negative income before tax and R&D. A 
second classification of observations is similarly constructed but uses the prior year’s 
pre-tax income as the income objective. 
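
The classification rule just described can be sketched as a small function. This is a minimal illustration, not the authors' procedure: the function and argument names are hypothetical, and the treatment of boundary values (income exactly equal to the objective, or exceeding it by exactly the prior year's R&D) is an assumption, since the text does not say how ties are assigned.

```python
def classify_case(income_before_tax_and_rd: float,
                  prior_rd: float,
                  objective: float = 0.0) -> int:
    """Assign an observation to case 1, 2, or 3.

    `objective` is the income objective: 0.0 for avoiding losses, or the
    prior year's pre-tax income for reporting an earnings increase.
    """
    excess = income_before_tax_and_rd - objective
    if excess > prior_rd:
        # Income clears the objective even after funding last year's level of R&D.
        return 1
    if excess < 0:
        # Income falls short of the objective even with no R&D at all.
        return 3
    # Assumption: boundary values fall in case 2, where the R&D decision
    # determines whether the objective is met.
    return 2
```

For example, with prior-year R&D of 100, income before tax and R&D of 150 falls in case 1, 40 in case 2, and -10 in case 3 when the objective is to avoid losses.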

There is a positive correlation between R&D spending and reported income, and we 
therefore expect greater relative R&D spending for case 1 observations, even when 
firms do not alter spending to achieve current-period income objectives. Hence, com- 
parisons between case 2 and case 3 firms potentially offer more persuasive evidence 
that concern about reporting positive income or increasing levels of income is related 
to decisions to invest in R&D. That is, greater spending for case 3 than for case 2 cannot 
be attributed to resource constraints, nor can it be attributed to a direct relation be- 
tween financial performance and investment opportunities. 

The sample comprises industrial firms (SIC 20-39) from the 1988 COMPUSTAT 
data file that have R&D expense greater than 1 percent of sales for each of the 13 years 
1975-1987. To ensure predictable income consequences of altering R&D spending, we 
excluded firms with government-sponsored R&D. Observations prior to 1975, when 
SFAS No. 2 reporting requirements for R&D spending went into effect, were not used.
We further restricted the sample to firms with the data required to compute the
variables used in the analysis for all years 1975-1987. These criteria yield 438 firms.
We used the first two years of data to compute measures required by the analysis. Thus,
the final sample consists of 4,818 firm-year observations (438 firms times 11 years, 
1977-1987). 


III. Results
Univariate Comparisons 


Table 2 displays descriptive statistics for relative R&D spending defined as the ratio 
of current- to prior-period spending. Consistent with expectations, mean relative R&D 
spending is lower for case 2 than for cases 1 and 3, but the comparisons are sometimes 
inconsistent with these expectations at the quartiles of the distributions. For both classifications,
the difference between the means of cases 2 and 3 is statistically significant
($\alpha < 0.10$). However, because mean relative R&D spending is not constant across the
years in the sample and because the number of case 2 and case 3 firms are not distrib- 
uted uniformly over the years, these results by themselves are inadequate for drawing 
inferences. 


Multivariate Specifications 
The general form for the regression specifications is:

\[ r = X\beta + D\gamma + \mu, \]

where $r$ is an $n \times 1$ vector of observations on R&D spending; $X$ is an $n \times m$ matrix of $m$
covariates (the $m \times 1$ vector $\beta$ gives the related parameters); $D$ is an $n \times k$ matrix of
dummy variables that distinguish case 1 and 3 observations from case 2 observations
(the $k \times 1$ vector $\gamma$ gives the related parameters); and $\mu$ is an $n \times 1$ vector of residuals.
Table 2

Comparisons of Relative R&D Expense for Observations Classified
According to How Income Before Taxes and R&D Compares with
Zero and with the Prior Year’s Income Before Tax

Dependent Variable: Current-Period R&D Expense Deflated by the Prior Year’s Expense, 1977-1987

                        Number of                 Standard     First                   Third
Firm Classification     Observations    Mean      Deviation    Quartile    Median      Quartile

Entire Sample           4,818           1.176     0.350        1.023       1.135       1.272

Income Objective is to Avoid Losses

Case 1                  4,145           1.204     0.328        1.050       1.150       1.284
Case 2                  262             0.960     0.279        0.786       0.957       1.116
Case 3                  411             1.032     0.495        0.797       0.981       1.155

Comparison of means, case 2 versus case 3: t=2.121 (p=0.017)

Income Objective is to Report Earnings Increases

Case 1                  3,285           1.212     0.353        1.055       1.160       1.297
Case 2                  761             1.087     0.266        0.976       1.085       1.191
Case 3                  772             1.111     0.386        0.940       1.070       1.220

Comparison of means, case 2 versus case 3: t=1.398 (p=0.082)
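
As a rough check on the comparisons of means in table 2, the t-statistics can be approximated from the published means, standard deviations, and sample sizes. The sketch below assumes a pooled-variance two-sample test; the paper does not state which form of the test was used, so this choice and the function name `pooled_t` are assumptions. Computed from the rounded table entries, the statistics come out at roughly 2.15 and 1.42, close to but not identical with the reported 2.121 and 1.398, which may reflect rounding in the table or a different variance estimator.

```python
import math


def pooled_t(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Two-sample t-statistic (group b minus group a) with a pooled variance."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / standard_error


# Case 2 versus case 3, objective: avoid losses (values from table 2).
t_avoid_losses = pooled_t(0.960, 0.279, 262, 1.032, 0.495, 411)

# Case 2 versus case 3, objective: report earnings increases (values from table 2).
t_report_increases = pooled_t(1.087, 0.266, 761, 1.111, 0.386, 772)
```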


Various measures of R&D are considered as the dependent variable. The reported 
results define r as current R&D deflated by lagged R&D. Note that using this measure is 
equivalent to using percent changes in R&D spending.5

Covariates X are introduced to control for cross-sectional and time-variant factors 
that potentially influence R&D spending. Lagged R&D spending (defined as above) and 
current and lagged revenues, appropriately deflated, are included to obtain the 
reported results. We are not directly concerned about the effects of the covariates; thus,
we do not report the estimates of $\beta$.6


5 Results are comparable when R&D is specified as current spending deflated by the lagged moving average
for the prior three years. A referee suggested the use of R&D deflated by current revenues. Results are similar
for this specification of levels, although the estimates $\gamma_j$ are much less significant. One plausible reason why
these results are less significant is that the sales deflator is highly correlated with income, which is the basis for
classifying the observations as case 1, 2, or 3. That is, when sales are relatively high (low), and the observation
is most likely to be classified as case 1 (3), the ratio of R&D to sales is low (high). Unless R&D is a pure variable cost,
this relation occurs by construction. This is not a problem for the reported results because lagged R&D is
uncorrelated with the classifications.

6 Existing studies suggest that R&D investment depends in part on the industry and the firm’s competitive
position within the industry (e.g., Grabowski 1968; Mansfield 1980) and on the availability of funds (e.g.,
Branch 1974; Erikson and Jacobson 1989; Grabowski and Mueller 1978). Including the prior year’s spending
and the current and prior year’s sales attempts to control for the effects of these factors. Furthermore, factors
such as the prevailing interest rates, which influence all investment decisions, and tax-related incentives can
vary from year to year. Including slope and intercept fiscal-year dummy variables to accommodate time-dependent
structural changes in the estimates $\beta$ does not alter the conclusions. More generally, the significance of the
estimates $\gamma_j$ is robust to alternative specifications of the covariates, although the covariates are jointly
significant for all specifications.
Four dummy variables are used in the primary specification (that is, $D$ is $n \times 4$).
Specifically,

$D_1 = 1$ when income before tax and research and development exceeds lagged R&D,
and 0 otherwise;

$D_2 = 1$ when income before tax and R&D is negative, and 0 otherwise;

$D_3 = 1$ when income before tax and R&D exceeds the prior year’s income before
tax by an amount greater than lagged R&D, and 0 otherwise; and

$D_4 = 1$ when income before tax and R&D is less than the prior year’s income
before tax, and 0 otherwise.


Note that two sets of dummy variables distinguish case 1 observations ($D_1$ when the
income objective is to avoid losses, $D_3$ when the objective is to report increases) and
case 3 observations ($D_2$ when the objective is to avoid losses, $D_4$ when the objective is to
report increases) from case 2 observations. The corresponding estimates $\gamma_j$ indicate
differences between the mean relative R&D expense for observations in the specified
classification versus the case 2 observations. That is, when the focus is the effect of
concern for reporting positive accounting income, $\gamma_1$ indicates the difference in spending
for case 1 versus case 2 comparisons and $\gamma_2$ indicates the difference for case 3
versus case 2 comparisons. Similarly, when the focus is the effect of concern for reporting
income increases, $\gamma_3$ indicates the spending difference for case 1 versus case 2
comparisons and $\gamma_4$ indicates the difference for case 3 versus case 2 comparisons.7

If managers are specifically concerned about reporting positive income or increasing
accounting income, then we expect positive estimates $\gamma_j$ for all $j$. Specifically, we
test the null hypotheses $H_0$: $\gamma_j \le 0$ versus $H_1$: $\gamma_j > 0$. Rejecting the null in favor of the
alternative hypothesis is consistent with assertions that firms reduce their R&D spending
to report favorable trends in accounting net income, especially for $\gamma_2$ and $\gamma_4$, which
distinguish the critical case 2 versus case 3 comparisons.8
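
For a one-tailed test of this kind, the significance level implied by a t-statistic can be approximated with the upper tail of the standard normal distribution. The sketch below assumes that approximation, which is reasonable for samples of several thousand observations but is not stated in the paper; `one_tailed_p` is a hypothetical helper name. As a check, the t-statistics of 0.400 and 0.516 reported in table 6 correspond to p-values of about 0.345 and 0.303 under this approximation, matching the reported values.

```python
import math


def one_tailed_p(t_stat: float) -> float:
    """Upper-tail probability P(Z > t_stat) for a standard normal variable Z."""
    return 0.5 * math.erfc(t_stat / math.sqrt(2))


p_first = one_tailed_p(0.400)
p_second = one_tailed_p(0.516)
```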


Results for Multivariate Tests 


The results in table 3 support the hypothesis that relative R&D is lower for case 2 
observations. The positive and statistically significant estimates $\gamma_2$ and $\gamma_4$ indicate that


7 Seventy-two observations are specified as case 2 for both classifications. These represent 26.5 percent of
the case 2 observations when the objective is to avoid losses and 9.5 percent of the case 2 observations when the
objective is to report earnings increases. The reported results are for a specification that considers the
classifications jointly, although we obtain statistically significant estimates $\gamma_j$ when the classifications are considered
separately.

8 Most R&D investments entail commitments that extend beyond the current year, such that decisions to
accept (reject) proposals may lead to greater (less) expense in future periods as well as in the current period. If
proposals are deferred to periods when spending does not compromise the ability to report positive or increasing
income, then we expect greater relative expense in years that follow case 2 observations. Whether or how
such effects manifest themselves depends on whether case 2 firms actually reject or simply defer R&D
spending to report favorable income trends. Investigating such effects is beyond the objective of the study,
but we consider whether such effects influence the primary results by distinguishing observations that follow
case 2 observations. Estimates for $\gamma_j$ are comparable for these specifications.
Table 3

Estimates for $\gamma_j$ ($\times 10^2$) to Compare Mean Relative R&D Spending for
Cases 1 and 3 Observations with Case 2 Observations

                                                          Income Objective
                                                                        Report Income
                                                  Avoid Losses          Increases
Case 1
Income before tax and R&D exceeds income          $\gamma_1 = 18.667$   $\gamma_3 =$ [illegible]
objective by more than the prior year’s R&D       t = 8.711             t = 3.435
                                                  p < 0.001             p < 0.001
                                                  n = 4,145             n = 3,285
Case 3
Income before tax and R&D is less than the        $\gamma_2 = 7.873$    $\gamma_4 = 9.134$
income objective                                  t = 3.042             t = 5.128
                                                  p < 0.001             p < 0.001
                                                  n = 411               n = 772
Number of case 2 observations                     262                   761

Mean dependent variable = 1.176
n = 4,818
Adjusted R² = 0.181
Model F-statistic = 132.9

Specification:

\[ \frac{\mathrm{RDEXP}_{i,t}}{\mathrm{RDEXP}_{i,t-1}} = \beta_0 + \beta_1 \frac{\mathrm{RDEXP}_{i,t-1}}{\mathrm{RDEXP}_{i,t-2}} + \beta_2 \frac{\mathrm{SALES}_{i,t}}{\mathrm{RDEXP}_{i,t-1}} + \beta_3 \frac{\mathrm{SALES}_{i,t-1}}{\mathrm{RDEXP}_{i,t-2}} + \sum_{k=1}^{4} \gamma_k D_k + \mathrm{ERROR} \]

Note: $\mathrm{RDEXP}_{i,t}$ is the R&D expense and $\mathrm{SALES}_{i,t}$ is the revenue for firm i in year t; $D_k$ are dummy variables
representing the cases described above.


R&D is higher for case 3 than for case 2 even though firms in case 2 have higher (relative)
income.9 Furthermore, differences are nontrivial: $\gamma_2 = 0.0787$ divided by 0.960 (mean
relative spending for case 2 firms from table 2) implies that mean relative spending is
about 8.2 percent greater for case 3 when the income objective is to report positive income;
$\gamma_4 = 0.0913$ divided by 1.087 yields about 8.4 percent greater spending for case 3 when
the objective is to report income increases.
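
The two percentages follow directly from dividing each estimate by the corresponding case 2 mean. A short sketch of that arithmetic, using only the figures quoted in the text (the function name is hypothetical):

```python
def relative_difference(gamma_estimate: float, case2_mean: float) -> float:
    """Spending difference for case 3 expressed as a share of case 2 mean spending."""
    return gamma_estimate / case2_mean


# Objective: report positive income (gamma_2 and the case 2 mean from table 2).
avoid_losses = relative_difference(0.0787, 0.960)

# Objective: report income increases (gamma_4 and the case 2 mean from table 2).
report_increases = relative_difference(0.0913, 1.087)

print(f"{avoid_losses:.1%}, {report_increases:.1%}")
```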

Table 4 focuses on relative spending by case 2 firms. This analysis indicates that in 
49 cases reductions in spending change pre-tax income from negative to positive, and 
in 55 cases reductions in R&D yield income increases. Further analysis indicates that 


9 Results are comparable when R&D spending is constrained to be no greater than twice and no less than
half of the prior year’s spending. Thus, the results do not reflect the influence of extreme observations. To
examine the possibility that financially distressed firms may somehow respond differently from profitable
firms (i.e., change their business strategies), we considered a sample that omitted firms with more than one
case 3 observation defined around zero during 1977-1987. We also tested a sample that eliminated firms with
no case 2 observations during the period. Results for these and other such procedures are comparable to those
reported in table 3; that is, all estimates $\gamma_j$ are statistically significant, typically at $\alpha < 0.001$. Finally, OLS
residuals are heteroskedastic. Thus, the reported results are for a weighted least squares specification, with
weights computed by the White (1980) procedure.
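
The bounding of extreme observations mentioned in this note (relative spending held between half and twice the prior year's level) amounts to a simple clamp on the spending ratio. A minimal sketch, with hypothetical function and parameter names:

```python
def constrain_relative_spending(current: float, prior: float,
                                lower: float = 0.5, upper: float = 2.0) -> float:
    """Ratio of current to prior spending, bounded to the interval [lower, upper]."""
    ratio = current / prior
    return min(max(ratio, lower), upper)
```

A firm that triples its R&D would enter the analysis with a ratio of 2.0, and a firm that cuts R&D by 70 percent would enter with a ratio of 0.5; ratios inside the bounds are unchanged.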
Table 4

Analysis of Increases and Decreases in Relative Spending for Case 2 Observations

                                                          Income Objective
                                                                        Report Income
                                                  Avoid Losses          Increases
Case 2 firms that:
  Decrease R&D spending                           151                   228
  Increase R&D spending                           111                   533
  Report pre-tax income greater than the
  income objective                                49                    55
Total case 2 observations                         262                   761


the classifications explain only 5.2 percent of R&D spending. Thus, although the results 
in table 3 indicate that case 2 firms commit relatively less to R&D than case 3 firms, the 
evidence does not suggest that concern about reported income dominates R&D spend- 
ing decisions. 
In table 5, case 1 and 3 observations are further partitioned on the basis of proximity
to the presumed income objective. This delineation permits comparing mean relative
R&D spending for case 2 with those case 1 observations that are well above this
critical range (category a), with case 1 and 3 observations that are close to this critical
range (categories b and c), and with case 3 observations that are well below the critical
range (category d). As before, positive (negative) estimates of the coefficients indicate
that the mean relative spending is greater (less) for the observations in the category
being examined than for the case 2 observations.
The most compelling result in table 5 is the positive and highly significant estimate
for category d (observations that are well below the income objectives), which indicates
that R&D for these observations is greater than for case 2 observations. This result suggests
that managers take actions that reduce current income but increase future income
when they are least likely to achieve their current-period income objectives (Healy
1985; McNichols and Wilson 1989).


Results for Capital Investment 


A competing explanation for the results is that lower relative expenditures for case
2 firms reflect unspecified differences in investment opportunity sets, incentives to
invest, or attitudes toward risk (e.g., financially distressed case 3 firms may accept high-risk
projects to improve their performance) and these factors may be correlated with
the classifications. To investigate this possibility, we consider tangible capital investment.
Unlike R&D, these investments are capitalized and have little effect on current
income. Thus, managers can more effectively influence current reported income by
altering R&D rather than capital investment. If the observed differences in relative R&D
are attributable to differences in investment incentives, then we expect to observe comparable
differences in capital spending.
Except for missing observations and the use of capital expenditures to compute the
dependent variable and the covariates, the specification in table 6 corresponds to that
Table 5

Estimates for $\gamma_j$ ($\times 10^2$) for Finer Classifications of the Observations

                                                                  Income Objective
                                                                                  Report Income
Income Before Tax and R&D                                 Avoid Losses            Increases
Case 1
(a) Exceeds the income objective by more than             $\gamma_{1a} = 22.762$  $\gamma_{3a} = 6.979$
    twice the prior year’s R&D                            t = 10.510              t = 4.681
                                                          p < 0.001               p < 0.001
                                                          n = 3,593               n = 1,605
(b) Exceeds the income objective by more than             $\gamma_{1b} = 10.903$  $\gamma_{3b} = -1.060$
    the prior year’s R&D but less than twice the          t = 4.542               t = -0.741
    prior year’s R&D                                      p < 0.001               p = 0.229
                                                          n = 552                 n = 1,680
Case 3
(c) Is less than the income objective but exceeds the     $\gamma_{2c} = 5.018$   $\gamma_{4c} = 5.844$
    income objective minus the prior year’s R&D           t = 1.573               t = 2.817
                                                          p = 0.058               p = 0.002
                                                          n = 164                 n = 356
(d) Is less than the income objective minus the           $\gamma_{2d} = 6.395$   $\gamma_{4d} = 14.159$
    prior year’s R&D                                      t = 2.180               t = 6.406
                                                          p = 0.018               p < 0.001
                                                          n = 247                 n = 416

Mean dependent variable = 1.176
n = 4,818
Adjusted R² = 0.180
F-statistic = 96.814

Specification:

\[ \frac{\mathrm{RDEXP}_{i,t}}{\mathrm{RDEXP}_{i,t-1}} = \beta_0 + \beta_1 \frac{\mathrm{RDEXP}_{i,t-1}}{\mathrm{RDEXP}_{i,t-2}} + \beta_2 \frac{\mathrm{SALES}_{i,t}}{\mathrm{RDEXP}_{i,t-1}} + \beta_3 \frac{\mathrm{SALES}_{i,t-1}}{\mathrm{RDEXP}_{i,t-2}} + \sum_{k=1}^{8} \gamma_k D_k + \mathrm{ERROR} \]

Note: $\mathrm{RDEXP}_{i,t}$ is the R&D expense and $\mathrm{SALES}_{i,t}$ is the revenue for firm i in year t; $D_k$ are dummy variables
representing the cases described above.


in table 3. The results differ, however, in that comparisons of case 2 versus case 3 capital
investment ($\gamma_2$ and $\gamma_4$) are not statistically significant.10 The results suggest that
poorly performing firms commit relatively less to capital investment but not significantly
less for case 2 versus case 3. This contrast suggests that the R&D results in table 3


10 Observations are eliminated when either the current or prior year’s capital spending is zero. Also, to
mitigate effects of extreme observations, 34 cases (out of 4,805) with capital spending more than ten times the
prior year’s spending are constrained to equal ten times the prior year’s spending. The statistical significances
of the estimates when relative spending is not so constrained are comparable to those reported in table 6,
although the estimates for $D_1$ and $D_2$ ($D_3$ and $D_4$) are much larger (smaller). To illustrate the effect of extreme
observations, constraining relative spending to be ten times the prior year reduces mean relative capital investment
(the dependent variable) for all observations from 1.49 to 1.37. Finally, results using capital spending net of
disposals are comparable to those reported.
Table 6

Estimates for $\gamma_j$ ($\times 10^2$) for Relative Capitalized Investment

                                                          Income Objective
                                                                        Report Income
                                                  Avoid Losses          Increases
Case 1
Income before tax and R&D exceeds income          $\gamma_1 = 37.899$   $\gamma_3 = 16.794$
objective by more than the prior year’s R&D       t = 5.521             t = 4.065
                                                  p < 0.001             p < 0.001
                                                  n = 4,150             n = 3,243
Case 3
Income before tax and R&D is less than the        $\gamma_2 = 3.383$    $\gamma_4 = 2.904$
income objective                                  t = 0.400             t = 0.516
                                                  p = 0.345             p = 0.303
                                                  n = 396               n = 754
Number of case 2 observations                     258                   807

Mean dependent variable = 1.374
n = 4,804
Adjusted R² = 0.220
Model F-statistic = 194.85

Specification:

\[ \frac{\mathrm{CAPXP}_{i,t}}{\mathrm{CAPXP}_{i,t-1}} = \beta_0 + \beta_1 \frac{\mathrm{CAPXP}_{i,t-1}}{\mathrm{CAPXP}_{i,t-2}} + \beta_2 \frac{\mathrm{SALES}_{i,t}}{\mathrm{CAPXP}_{i,t-1}} + \beta_3 \frac{\mathrm{SALES}_{i,t-1}}{\mathrm{CAPXP}_{i,t-2}} + \sum_{k=1}^{4} \gamma_k D_k + \mathrm{ERROR} \]

Note: $\mathrm{CAPXP}_{i,t}$ is the capitalized investment and $\mathrm{SALES}_{i,t}$ is the revenue for firm i and year t; $D_k$ are dummy
variables representing the cases described above.


cannot be attributed to differences in unspecified factors correlated with the classifica- 
tion. 

In addition, the contrast between tables 3 and 6 is relevant for evaluating the consequences
of SFAS No. 2, which eliminated the capitalization of R&D. In particular, these
results are consistent with conclusions advanced in prior studies that this requirement 
reduced incentives for some firms to invest in R&D (Elliott et al. 1984; Horwitz and 
Kolodny 1980). 


Results for a Sample of Firms Without Income-Based Compensation Plans


Healy (1985) argues that managers manipulate accounting practices to achieve the 
income objectives specified in their compensation arrangements. This raises the possi- 
bility that the results in table 3 reflect attempts to increase current pecuniary compen- 
sation. To investigate this possibility, we identify 43 firms that report no income-based 
compensation plans in proxy statements filed with the Securities and Exchange Commission.11


11 Information is from 1984 proxy statements when they are available. Otherwise, information is from 1982
or 1983 proxy statements.
Table 7

Coefficients for $\gamma_j$ ($\times 10^2$) for a Sample of 43 Firms without
Income-Based Compensation Plans

                                                          Income Objective
                                                                        Report Income
                                                  Avoid Losses          Increases
Case 1
Income before tax and R&D exceeds income          $\gamma_1 = 11.938$   $\gamma_3 = 7.038$
objective by more than the prior year’s R&D       t = 1.974             t = 2.076
                                                  p = 0.024             p = 0.019
                                                  n = 417               n = 325
Case 3
Income before tax and R&D is less than the        $\gamma_2 = -1.732$   $\gamma_4 = 13.123$
income objective                                  t = -0.242            t = 2.782
                                                  p = 0.404             p = 0.003
                                                  n = 34                n = [illegible]
Number of case 2 observations                     22                    69

Mean dependent variable = 1.160
n = 473
Adjusted R² = 0.131
Model F-statistic = 11.132

Specification:

\[ \frac{\mathrm{RDEXP}_{i,t}}{\mathrm{RDEXP}_{i,t-1}} = \beta_0 + \beta_1 \frac{\mathrm{RDEXP}_{i,t-1}}{\mathrm{RDEXP}_{i,t-2}} + \beta_2 \frac{\mathrm{SALES}_{i,t}}{\mathrm{RDEXP}_{i,t-1}} + \beta_3 \frac{\mathrm{SALES}_{i,t-1}}{\mathrm{RDEXP}_{i,t-2}} + \sum_{k=1}^{4} \gamma_k D_k + \mathrm{ERROR} \]

Note: $\mathrm{RDEXP}_{i,t}$ is the R&D expense and $\mathrm{SALES}_{i,t}$ is the revenue for firm i and year t; $D_k$ are dummy variables
representing the cases described above.


Results for these firms, displayed in table 7, are generally consistent with those reported
in table 3, although the estimate $\gamma_2$ is not statistically significant (perhaps because
of the smaller number of observations). Thus, the results reported in table 3 are
not likely to be attributable to the effect of income-based compensation plans.


IV. Concluding Remarks 


Evidence in this study is consistent with the hypothesis that decisions to invest in
R&D are influenced by managers’ concern about reported earnings. Such evidence suggests
that the regulation of accounting practice has direct and nontrivial economic
consequences.
SYNOPSIS AND INTRODUCTION: During 1987, the climate of international
bank lending changed dramatically and prompted major restatements
of the loan portfolios of the largest U.S. money-center and regional
banks. The circumstances involved decisions by several countries in Latin 
America—notably Brazil—to suspend scheduled interest and principal pay- 
ments on their foreign debt. The exposure of the U.S. banks became most 
visible on February 20, 1987, when Brazil declared a moratorium on interest 
payments on $67 billion of medium- and long-term bank debt and, five days 
later, froze payments on $10 billion of short-term credits and $5 billion of 
money market deposits. A chain of events began in March 1987 and 
produced the largest reported losses in U.S. banking history. 

This article examines how the stockholders’ returns of 13 of the largest 
U.S. money-center and regional banks were affected by disclosures made 
during 1987 regarding decisions to place Brazilian loans on a nonaccrual 
basis and to increase loan loss reserves to recognize the higher probability 
of default and the lower present value of future interest and principal. The 
study adds to the recent literature on banks’ earnings and asset relations
(Barth et al. 1990; Beaver et al. 1989) and to the accumulated evidence on
the role of banks’ accounting decisions in response to the resulting
substantial asset impairment caused by the 1987 Latin American debt situa-
tion (Elliott et al. 1989; Grammatikos and Saunders 1990; ponte? 1989;
Musumeci and Sinkey 1990a, 1990b).
Using a methodology that focuses on the unanticipated short-term
effects of the announcements, we find that the stock market responded
adversely to the banks’ reclassification of loans to the nonaccrual basis and
positively to subsequent announcements of additions to loan loss
provisions. The latter reaction is viewed as consistent with banks’ use of
those adjustments as credible signals about their intentions and abilities to
resolve the Latin American debt situation.

We also find that changes in secondary market prices for Brazilian
loans explain banks’ stockholders’ returns during the period and that returns
measured over short intervals varied according to the balance-sheet
amount of foreign loans. Such results are consistent with the hypothesis
that the stock market discriminates among banks on the basis of reported
foreign loan data.


Key Words: Banks, Latin American loans, Loss provisions, Nonaccrual
loans.


Data Availability: Available upon request, except for secondary loan price
data, which are proprietary to SESCH Lehman Asset
Trading, New York.


The remainder of this article consists of the following: Section I reviews the ac-
counting principles and regulatory treatments affecting accounting and report-
ing for bank loans. Section II outlines the accounting announcements and disclo-
sures and offers a possible explanation about how they potentially affect stockholder
returns. We test return models of the joint effect of the accounting events in section III
and evaluate the potential explanatory role of secondary market prices for Brazilian
loans and balance-sheet measures of foreign loan exposure in section IV. The results
are interpreted in the final section.


I. Regulations Affecting Accounting and Reporting for Loans
GAAP Accounting 


A bank’s decision to adjust its reported foreign loan ees is guided mainly by
generally accepted accounting principles (GAAP). Under GAAP, banks report their
loans at the principal amounts outstanding net of unearned income and recognize in-
terest and other loan-related revenue as accruing over time. However, when principal
or interest is past due for 90 days or more or when payment in full is not expected, loans
should be reclassified to a nonaccrual basis and are accounted for on a cash or cost
recovery basis until they qualify for return to the accrual status. Accrued inter-
est is normally reversed against interest revenue, thereby causing a drop in net income
and loan balance in the period of the reversal.¹

Net income and balance-sheet loan balances are also affected by the allowance for 
loan losses, which reflects possible future credit losses inherent in the loan portfolio. 
Although the AICPA’s bank Industry Audit Guide (1983) provides general criteria, man- 
agement judgment ultimately dictates the assessment of this amount by relying on fac- 
tors such as past loss experience, economic conditions, international risk, and changes 
in the composition and volume of the loan portfolio. Thus, both an increase in loan loss 
provision and a change to nonaccrual status potentially decrease the net balance-sheet 
value of outstanding loans, reported bank net income, and stockholders’ equity.


Banking Regulations 


In addition to GAAP, banks must comply with regulations formulated by the Fed- 
eral Reserve Board, the Federal Deposit Insurance Corporation, and the Comptroller of 
the Currency that are reported in the U.S. Code of Federal Regulations, Title 12.² These
regulations stipulate various ratio calculations that are reviewed several times a year by 
the Interagency Country Exposure Review Committee. For example, federal regula- 
tions during 1987 stipulated minimum primary and total capital adequacy ratios of 5.5 
and 6.0 percent, respectively. However, except for tax effects, increases in the loan loss 
allowance do not directly affect regulatory capital ratios because the loan loss allow- 
ance is added back to the reported stockholders’ equity and assets for capital adequacy 
purposes. Nonetheless, the Interagency committee can apply higher capital standards 
to banks individually, particularly if it is believed that higher actual loan write-offs will 
occur in the future, and the committee can designate particular risk classifications 
for countries not servicing their debts that require specific reserves or write-downs for 
loans to those countries.³
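
The add-back described above can be illustrated with a small numerical sketch. The balance-sheet amounts are hypothetical (they are not taken from any bank in the study), tax effects are ignored, and the ratio is simplified to primary capital over assets with the allowance added back to both, which is an assumption about the form of the calculation rather than the regulatory formula itself.

```python
def primary_capital_ratio(equity, allowance, net_assets):
    """Capital ratio with the loan loss allowance added back to both
    the numerator (equity) and the denominator (assets)."""
    return (equity + allowance) / (net_assets + allowance)

# Hypothetical bank, amounts in millions of dollars (illustrative only).
equity, allowance, net_assets = 5_000.0, 1_000.0, 100_000.0
before = primary_capital_ratio(equity, allowance, net_assets)

# A pre-tax provision of 2,000 lowers reported equity and net assets
# (loans are carried net of the allowance) and raises the allowance
# by the same amount.
provision = 2_000.0
after = primary_capital_ratio(equity - provision,
                              allowance + provision,
                              net_assets - provision)

# The reported (GAAP) equity-to-assets ratio, by contrast, does fall.
reported_before = equity / net_assets
reported_after = (equity - provision) / (net_assets - provision)

print(f"regulatory ratio: {before:.4f} -> {after:.4f}")
print(f"reported ratio:   {reported_before:.4f} -> {reported_after:.4f}")
```

Under these assumptions the regulatory ratio is unchanged by the provision while the reported equity ratio declines, which is the distinction the paragraph draws.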


II. Potential Economic Effects 


Given market efficiency and the knowledge that much information was available to 
anticipate the accounting and economic problems facing banks with Latin American 


1 For example, Citicorp’s 1987 annual report (p. 49) states: “When it is determined as a result of evaluation 
procedures that the payment of interest or principal on a commercial loan is doubtful of collection, the loan is 
placed on a cash (nonaccrual) basis. ... Any interest on a loan placed on a cash basis is reversed and charged 
against current earnings. Interest on cash basis loans is thereafter included in earnings only to the extent re- 
ceived in cash. Cash basis loans are returned to an accrual status when such loans are current as to principal 
and interest payments and future payments are expected to be made on schedule.” 

2 For minimum capital ratios that apply to all banks, refer to Title 12 of the Code of Federal Regulations (12 
CFR), Comptroller of the Currency, Treasury, Ch. 1, Pts. 3.5-3.13. For specific regulations regarding inter-
national lending, refer to 12 CFR, Comptroller of the Currency, Treasury, Ch. 1, Pts. 20.1-20.10; 12 CFR,
Federal Reserve System, Ch. 2, Pt. 211-D; and 12 CFR, Federal Deposit Insurance Corporation, Ch. 3, Pt. 351. In 
August 1983, partly in response to the August 1982 Mexican moratorium on foreign loan payments, these fed- 
eral disclosure rules were adopted by the Securities and Exchange Commission in an attempt to standardize 
the many allowable approaches that existed at the time (Smirlock and Kaufold, 1987). For an analysis of the 
various capital adequacy ratios and their impact on accounting choices, refer to Moyer (1990). 

3 During 1987, however, the interagency committee took no steps that would have preempted GAAP-
required or management actions, and only in September 1987 did the committee suggest that the Brazilian
loans might be “value impaired,” thus potentially requiring banks to set aside special reserves and to make pre-
scribed increases in loan loss provisions. By that time, the banks covered in this study had voluntarily in-
creased their loan loss provisions and were urging the federal committee to defer designation of the more strin-
gent value-impaired status, despite the fact that Brazil had not paid interest for at least six months. Only in
September did the banks initiate the issuance of new common stock to strengthen their regulatory capital 
ratios, possibly in anticipation of future writeoffs or more demanding federal provisions. 
debt, the potential market effects at the time of the accounting announcements repre-
sent the response to incremental information only. Consider the nonaccrual
announcements of March-April 1987. By complying with or anticipating the 90-day 
GAAP rule, banks took a first step in formally recognizing the impaired market value of 
their loan assets as a result of the deteriorating Latin American debt situation.⁴ Since
reported assets and income are reduced under nonaccrual, this valuation effect should 
lead to a decline in stock price, with the change being dependent on the loan adjust- 
ments occasioned by the accounting decision. An observed negative relation between 
stock returns and bank adjustments, however, requires that future earnings and 
information about Brazilian or other foreign loans not be fully anticipated on the basis 
of available sources.⁵ Our methodology partially controls for such other information.

A bank’s decision to adjust its foreign loan portfolio also depends on such non- 
accounting factors as banking regulations, Latin American and U.S. political and eco- 
nomic policies, company economic and financial strategies, and management judg- 
ment. The complexities and uncertainties of these factors make it unlikely that a bank’s 
decision to report significant loan-related losses would be based on accounting con- 
siderations alone. Thus, investors would be expected to respond not only to the re- 
ported income and asset effects but to broader implications as well, such as a bank’s 
ability to absorb the write-off and the implicit message to governments and regulators 
about future bank policy. This was particularly so with the loan loss decisions of 1987, 
when the major banks—led by Citicorp—increased their reserves to unprecedented 
levels, showing inter alia that they could well afford the substantial write-downs.⁶ If
perceived as strengthening the bargaining position of the banks with U.S. and foreign 
officials (e.g., by explicitly recognizing the threat of borrower default), these loan loss 
announcements reduce the uncertainty about an eventual resolution of the debt situa- 
tion, which might increase expected equity values as good news to the market. 


4 Banks’ early compliance with the 90-day nonaccrual rule, a response to the Brazilian moratorium, was 
most likely prompted, first, by Citicorp's “harder stance” announcement on March 16, 1987 of its intention to 
place Brazilian loans on a cash basis and, second, by the internal policy of Continental Illinois requiring non-
accrual after 60 days and instituted after the bank’s 1984 collapse and rescue by regulators (Bailey and Truell
1987).

5 An initial clue as to possible accounting adjustments regarding Latin American loans came from Citi-
corp’s chairman John Reed in an interview February 4, 1987 (Truell 1987). That interview centered on Reed’s
determination to stop making concessions to Third World debtor countries and suggested that bank revenues 
would decline substantially if Brazil were to receive restructuring arrangements similar to those given to 
Mexico and other Latin countries. It became even more likely that the individual banks would have to consider 
placing their loans on nonaccrual status when, shortly thereafter, Brazil declared a moratorium on interest 
payments February 20, 1987. Other Wall Street Journal news reports that could potentially have affected market 
expectations regarding the accounting events were also examined. Prior to May 20, the news was generally 
negative, consistent with a partial anticipation of the accounting adjustments. Examples include the lowering 
of debt ratings by Moody's and Standard & Poor’s, political developments in Brazil and other Latin American 
countries, earnings announcements, and debtor- and creditor-related restructuring decisions. However, ana- 
lysts cut their annual earnings forecasts beginning only in April 1987, after the initial nonaccrual announce- 
ments. A listing of all Wall Street Journal stories and an analysis of analyst forecast behavior during 1987 are 
available on request. 

6 A 1988 interview with Citicorp’s John Reed disclosed that he grew up in Latin America and speaks Span-
ish and Portuguese. “I know many of the top people there,” Reed said, “I felt that I was as good a gauge as any-
body on what to do.” (O'Reilly 1988). This, in addition to the fact that he runs the largest U.S. money-center 
bank, may help explain why Reed was the prescient leader in confronting the Latin American debt situation 
and attempting to protect his bank's revenues from pressure from the U.S. government (e.g., the Baker Plan) 
and others. 
Tax Issues


An economic analysis of the effects of loan accounting and reporting is also 
complicated by tax considerations. Prior to 1986, profitable banks could obtain tax ad- 
vantages by adding to the loan loss reserve. Banks could compute bad debt deductions 
for tax purposes based on a reasonable addition to the loan loss reserve to provide for 
the possibility that some loans might become uncollectible (the reserve method). To the 
extent that cumulative estimated loan losses exceeded cumulative actual loan write-offs,
banks were able to defer income tax payment on that excess amount. The Tax Reform
Act of 1986, however, disallowed the deductibility of additions to the loan loss reserve.
Also, for large banks, the Act ruled that the benefits of the reserve method taken prior to
1987 would be subject to recapture. In short, after taking the restrictive effects of the
1986 Act into account, the potential for higher loss provisions in 1987 to provide banks
with current tax benefits and thus increase shareholder value was limited.⁷


III. Cross-Sectional Analysis
Sample


The banks analyzed were 13 of the 15 largest U.S. banks identified in June 1987 by 
analysts at Keefe, Bruyette & Woods as having the most exposure to Latin American 
debt.⁸ Accounting decisions by these banks should have the greatest potential to affect
market behavior because of their size and influence. The two banks eliminated were 
Marine Midland and Irving Trust: Marine Midland was bought out by a foreign bank 
July 16, 1987, and the effects of the buyout negotiations overwhelmed all other effects 
on its stock price; Irving’s announcement of an increase in loan loss provision could not be
dated from published records. We also identified a control sample without loan loss 
announcements during the study period. From the population of the 52 largest U.S. 
banks identified by the Keefe Bankbook 1988 as having assets in excess of $10 billion on 
December 31, 1987, we deleted our initial treatment sample of 15 banks and 23 others 
either for lack of data or for announcing loan losses. The final control sample thus con- 
sisted of 14 nonannouncing large banks without significant Latin exposure. 

The 31 individual Wall Street Journal announcements are listed in table 1. Panel A 
reports the dates on which the banks announced either a consideration of reclassi- 
fication or reclassification of their Brazilian loans to the nonaccrual basis. By April 22, 
1987, all 13 banks had placed their Brazilian loans on nonaccrual status and in- 
corporated the diminished interest revenue in first-quarter results. Panel B reports 


7 For example, Citicorp’s 1987 annual report (p. 41) states: “Income tax expense was recognized in 1987
despite the $3.0 billion addition to the loan loss reserve allowance because only $259 million of tax benefits 
attributable to the addition to the allowance account could be recognized. Unrecognized benefits will be used 
in future years to reduce future income tax at the then prevailing rates.” Similarly, Chase was able to obtain a 
tax benefit of only $150 million from its $1.6 billion increase in reserves. However, banks may have been able to 
obtain limited benefits from increases in loan loss reserves by avoiding otherwise payable alternative 
minimum taxes, in part a function of the difference between accounting and taxable income (Elliott et al. 1989). 
For an empirical analysis of the (limited) tax effects of the 1987 loan loss provisions, see Grammatikos and 
Saunders (1990).

8 The banks studied here are: BankAmerica Corporation, Bankers Trust Co., Chase Manhattan Bank,
Chemical New York Corporation, Continental Illinois Corp., Citicorp, First Chicago Corporation, First Inter-
state Bancorp, Mellon Bank Corp., J. P. Morgan & Co., Manufacturers Hanover, Security Pacific Corp., and
Wells Fargo & Co. According to Musumeci and Sinkey (1990a), as of December 31, 1986, the 13 banks held 
88.73% of Brazil's $22.4 billion in loans from U.S. banks. 
Table 1
The Wall Street Journal Announcement Dates

Panel A. Date Bank Announced Latin Loans on Nonaccrual Basis:

Dates, in the order printed: 3/16/87, 3/19/87, 4/2/87, 4/3/87, 4/7/87, 4/9/87, 4/10/87, 4/16/87, 4/22/87

Announcements, in the order printed:
Citicorp ... to consider at Board meeting in mid-April
Continental Illinois ... intention
J. P. Morgan & Co.
BankAmerica
Manufacturers Hanover
Mellon Bank
Chemical New York
First Chicago Corp.
Chase Manhattan Bank
Bankers Trust Co.
Security Pacific
First Interstate Bancorp
Continental Illinois
Citicorp

Panel B. Date Bank Announced Increase in Loan Loss Provision:

Dates, in the order printed: 5/20/87, 6/21/87, 5/28/87, 5/29/87, 6/2/87, 6/9/87, 6/12/87, 6/15/87, 6/16/87, 6/17/87, 6/23/87, 7/9/87, 7/22/87

(Amounts in millions of dollars)
                                            Loan Loss    Second Quarter    1987 Annual
Announcement                                Increase*    Net Income*       Net Income†
Citicorp                                     $3,000       $-2,585           $-1,138
BankAmerica claims loan losses adequate
Chase Manhattan Bank                          1,600        -1,400              -896
BankAmerica balks at increasing loan losses
Security Pacific Corp.                          500          -172
BankAmerica                                   1,100        -1,140
Manufacturers Hanover ... intention
Chemical New York                             1,100        -1,104
First Interstate Bancorp                        750          -470
First Chicago                                   800          -698
Mellon Bank                                     415          -566
Bankers Trust Co. ... intention
Bankers Trust Co.                               700          -554
Manufacturers Hanover                         1,700        -1,373
Continental Illinois                            500          -477
J. P. Morgan & Co.                              875          -588
Wells Fargo & Co.                               550          -204

Panel C. Other (Earlier) Event:

2/20/87    Brazil declares moratorium on bank debt interest payments and short-term credits

* From 1987 second-quarter report to shareholders
† From 1987 annual report to shareholders


the dates on which the banks first announced either an intention to increase or an 
actual increase of their loan loss reserves. Those dates ranged from May 20 to July 22, 
1987.⁹ Overall, the loan loss provisions reduced 1987 second-quarter earnings by
about $13.59 billion.
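
As a quick arithmetic check on the $13.59 billion figure, the loan loss increases in panel B of table 1 can be summed directly. This is only a sketch: the dictionary below transcribes the table's amounts (in millions of dollars), with Chase's increase taken as $1.6 billion, the figure quoted in footnote 7.

```python
# Loan loss increases from table 1, panel B, in millions of dollars.
# Chase is entered at 1,600, the amount given in footnote 7.
loan_loss_increase = {
    "Citicorp": 3_000,
    "Chase Manhattan Bank": 1_600,
    "Security Pacific Corp.": 500,
    "BankAmerica": 1_100,
    "Chemical New York": 1_100,
    "First Interstate Bancorp": 750,
    "First Chicago": 800,
    "Mellon Bank": 415,
    "Bankers Trust Co.": 700,
    "Manufacturers Hanover": 1_700,
    "Continental Illinois": 500,
    "J. P. Morgan & Co.": 875,
    "Wells Fargo & Co.": 550,
}

total_millions = sum(loan_loss_increase.values())
print(f"{len(loan_loss_increase)} banks, total = ${total_millions / 1000:.2f} billion")
```

The 13 amounts add to $13,590 million, which agrees with the total stated in the text.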
Models


The tests that follow examine whether the announcements listed in table 1 provided 
incremental information to the market. We use a cross-sectional method that identifies 
the days surrounding the announcements reported in The Wall Street Journal (the inde- 
pendent variables) and examines the extent to which excess stock returns on these days 
differ as a function of the nonaccrual and loan loss events. Using $X(\cdot,\cdot)$ to denote other
information potentially related to a bank’s loan loss or nonaccrual notice, we state a
general linear model

$R_i(\cdot,\cdot) = \alpha NA_i + \beta LL_i + \rho X(\cdot,\cdot) + \epsilon_i$ (1.0)

and examine two versions¹⁰ that identify and control for other factors by a particular
specification of the $X(\cdot,\cdot)$ variable:

$R_i(\cdot,\cdot) = \alpha NA_i + \beta LL_i + \rho RM(\cdot,\cdot) + \epsilon_i$, and (1.1)
$R_i(\cdot,\cdot) = \alpha NA_i + \beta LL_i + \rho RI(\cdot,\cdot) + \epsilon_i$ (1.2)

where for each announcement i:

$R_i(p,q)$ = cumulative stock return from day p to day q, relative to day 0, the day of
The Wall Street Journal publication of the accounting event;
$RM(p,q)$ = cumulative return for NYSE industrials from day p to day q, relative to
day 0;
$RI(p,q)$ = cumulative return for NYSE financial index from day p to day q, relative
to day 0;
$NA_i$ = change/intention to change to nonaccrual status=1, otherwise 0;
$LL_i$ = change/intention to change provision for loan losses=1, otherwise 0; and
$\epsilon_i$ = cross-sectionally uncorrelated random error.¹¹


This cross-sectional approach differs from a more usual event study in that the 
models jointly analyze the returns for the identified event days on the basis of return ob- 
servations for the intervals surrounding only those event days. We calculate the daily 
return contemporaneous with each of the 31 announcements and examine the extent to 
which those returns differ as a function of their nonaccrual or loan loss status. For ex-





9 All loan loss announcements appeared on the Dow Jones Broad Tape after the close of trading on the pre-
vious trading day except for First Chicago and Manufacturers Hanover.

10 These two equations provide a test of the sensitivity of the basic model to alternative specifications. Also,
to the extent that a given nonaccrual or loan loss announcement contemporaneously affects other banks and 
these effects are reflected in the NYSE financial or industrial indexes of return, these specifications bias 
against finding nonzero coefficients. 

11 Although not literally true, we view the error terms, $\epsilon_i$, in equations (1.1) and (1.2) as independently and
identically distributed deviates with mean zero (though each regression equation’s error term distribution 
need not be identical). We examine the reasonableness of this assumption empirically. We estimate the regres- 
sions using both ordinary and weighted least squares approaches; the weighting is based on the residual vari- 
ance from daily time-series regressions of equations (1.1) and (1.2) for each bank from January 2, 1987 to Sep- 
tember 30, 1987. We also estimate time-series models for each bank over the study period and find that only one 
bank’s residuals (Continental Illinois) differ significantly at the 0.05 level from the other banks in terms of vari-
ance of residuals and first-order autocorrelation. The potential effects of cross-sectional correlation are more
troublesome since some event intervals overlap (table 1) and the inclusion of a market or financial index only 
partially removes those effects. However, an analysis of hypothetical nonevent days (see results) suggests that 
cross-sectional correlation has little bearing on the results. 
ample, in regression equation (1.1), the independent variables for the announcement
date regarding loan loss provision (LL) for bank i consist of event variables $NA_i=0$ and
$LL_i=1$ and continuous variable RM as of bank i’s LL date. The more usual approach
would analyze each event category (or each bank) separately, using returns in a non-
event period as the basis for estimating the error structure.¹² For comparison purposes,
we report event-study results later in this section.
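
The cross-sectional setup described above can be sketched in a few lines. Everything numerical here is synthetic: the index returns, the coefficient values, and the per-observation variances are made up for illustration, and the helper names `ols` and `wls` are hypothetical. Only the structure follows the text: 31 announcements (14 nonaccrual, 17 loan loss), two event indicators, one index return, and no separate intercept, since each announcement is either NA or LL.

```python
import numpy as np

rng = np.random.default_rng(0)

# 14 nonaccrual (NA) and 17 loan loss (LL) announcements, as in table 1.
n_na, n_ll = 14, 17
na = np.r_[np.ones(n_na), np.zeros(n_ll)]
ll = 1.0 - na                                  # every observation is NA or LL
rm = rng.normal(0.0, 0.01, size=n_na + n_ll)   # synthetic index returns

# Made-up coefficients; returns are generated without noise so that the
# estimators should recover them exactly.
alpha, beta, rho = -0.02, 0.03, 0.5
r = alpha * na + beta * ll + rho * rm

# Design matrix without an intercept column (NA + LL already sums to one).
X = np.column_stack([na, ll, rm])

def ols(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def wls(X, y, var):
    """Weighted least squares: scale each row by 1/sqrt(variance)."""
    w = 1.0 / np.sqrt(var)
    return ols(X * w[:, None], y * w)

coef_ols = ols(X, r)
# Hypothetical residual variances, standing in for the variances the
# article takes from each bank's time-series regression.
var = rng.uniform(0.5, 2.0, size=r.size) * 1e-4
coef_wls = wls(X, r, var)

print("OLS:", np.round(coef_ols, 4))
print("WLS:", np.round(coef_wls, 4))
```

With real data the returns contain noise, so the two estimators would differ; the weighting simply gives less influence to announcements by banks with more volatile residuals.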


Results 


Panels A and B of table 2 show the estimated coefficients and t-values of ordinary
least squares (OLS) and weighted least squares (WLS) versions of equations (1.1) and
(1.2). For example, the OLS results for days 0-2 in panel A show that LL has a positive coefficient
of 0.026 (t=2.912) and NA has a negative coefficient of -0.040 (t=-3.148); each is
significant at less than one percent with two-tailed tests.¹³ Overall, these results hold
best for the day 1 and days 0-2 return intervals and are generally consistent with re-
search based on earlier time periods. For example, Beaver et al. (1989) report that
banks’ ratios of market value to book value of equity are negatively related to nonper-
forming loan (nonaccrual) status and positively related to a loan loss variable.¹⁴

To assess the robustness of the methodology, we also estimate the cross-sectional 
regressions for each of several sets of hypothetical announcement dates whose chrono- 
logical pattern is identical to the pattern on day 0 (as indicated by the days shown in 
table 1). For example, if the hypothetical event date for Citicorp is set at day — 5, then its 
March 16, 1987 announcement is assumed to take place five trading days earlier, on 
March 9, 1987. The results (available on request) reveal that the most significant results 
are indeed for days 0, 1, and 2 and the results for the days outside these event windows 
are consistent with the predicted absence of an event.¹⁵
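
The date shift in the Citicorp example can be reproduced with a short sketch. The helper `shift_trading_days` is hypothetical and treats every weekday as a trading day; exchange holidays are ignored, which is an assumption that happens to be harmless for this particular week.

```python
from datetime import date, timedelta

def shift_trading_days(day, offset):
    """Move `day` by `offset` trading days, counting Monday-Friday only.

    Exchange holidays are not modeled (an assumption of this sketch).
    """
    step = timedelta(days=1 if offset > 0 else -1)
    remaining = abs(offset)
    while remaining:
        day += step
        if day.weekday() < 5:      # 0-4 are Monday through Friday
            remaining -= 1
    return day

citicorp_na = date(1987, 3, 16)               # Citicorp's nonaccrual announcement
hypothetical = shift_trading_days(citicorp_na, -5)
print(hypothetical.isoformat())
```

Applying the same offset to every date in table 1 keeps the chronological pattern of the announcements intact while moving the whole set away from the actual event days.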

We also find similar results when we examine the market’s response in a more 
usual event-study format focusing separately on the moratorium date (February 20), 
Citicorp’s loan loss announcement (May 20), all nonaccrual dates, and all loan loss 
announcements. First, table 3, panel A, shows that the market reacted negatively and 
significantly to the Brazilian moratorium, though largely on day 1 (Monday, February
23). Second, the average excess returns with respect to the Citicorp announcement
(panel B) are significantly positive on day 1 (though not on day 0).¹⁶ We also find signifi-


12 Studies by Cornell and Shapiro (1988), Smirlock and Kaufold (1987), and Se et al. (1987), for in-
stance, use common event dates for all companies.

13 In addition, the results are qualitatively identical when we estimate cross-sectional model (1.2) substi-
tuting RC(p,q) for RI(p,q), where RC(p,q) is the cumulative return for the control sample from day p to day q,
relative to day 0.

14 Beaver et al. (1989, 169) interpret their similar loan loss result, based on 1979-1983 data, as “. . . consis-
tent with contentions in the popular financial press that increasing the allowance for loan losses is actually
‘good news,’ because it indicates that management perceives the earning power of the bank to be sufficiently
strong that it can withstand a ‘hit to earnings’ in the form of additional loan-loss provisions.” However, they
use a broader definition of nonperforming (nonaccrual) status to also include restructured debt and property
acquired in foreclosures in lieu of loan balance.

15 We also estimate models (1.1) and (1.2) without Citicorp’s initial NA and LL announcements. The results
in table 2 are unchanged by the deletion of the first announcements. However, it is still the case that individual
later announcements are unlikely to be equal in information content to the earlier ones, especially Citicorp’s.

16 Citicorp issued its press release at 4:45 p.m. eastern standard time on May 19 after closing at $50.625, and
the stock opened at 9:57 a.m. the next day at $52.75. The event appeared largely unanticipated, though Citicorp
had informed key U.S. government officials (at the SEC, Treasury, and Federal Reserve Board) on May 18 and
certain foreign governments (personal letters were flown in the previous weekend to the presidents of Mexico,


The Accounting Review, October 1991


Table 2
Estimated Coefficients and Significance Tests for
Daily Return Cross-Sectional Regressions

Panel A. Equation (1.1): $R_i(\cdot,\cdot) = \alpha NA_i + \beta LL_i + \rho RM(\cdot,\cdot) + \epsilon_i$

Dependent   Avg.
Variable    R_i       LL      t(LL)    NA       t(NA)    RM       t(RM)    F-stat.  Prob.(F)
OLS Regressions
R(0)       -0.0007   0.002   0.309   -0.007   -0.619    0.014    0.025    0.192    0.827
R(1)        0.0046   0.010   1.901   -0.014   -1.774    0.204    0.462    1.575    0.225
R(0,1)      0.0039   0.013   1.412   -0.020   -1.450   -0.057   -0.121    1.109    0.344
R(0,2)      0.0097   0.026   2.912   -0.040   -3.148   -0.359    1.221    6.377    0.005
WLS Regressions*
R(0)       -0.0007   0.004   0.662   -0.007   -0.547    0.235    0.524    0.448    0.643
R(1)        0.0046   0.011   2.100   -0.016   -2.343    0.513    1.464    3.925    0.031
R(0,1)      0.0039   0.017   1.810   -0.023   -1.891    0.155    0.385    1.887    0.170
R(0,2)      0.0097   0.029   3.175   -0.042   -3.332    0.302    1.091    8.677    0.001

Panel B. Equation (1.2): $R_i(\cdot,\cdot) = \alpha NA_i + \beta LL_i + \rho RI(\cdot,\cdot) + \epsilon_i$

Dependent   Avg.
Variable    R_i       LL      t(LL)    NA            t(NA)         RI            t(RI)         F-stat.  Prob.(F)
OLS Regressions
R(0)       -0.0007   0.001   0.186   -0.008        [illegible]   0.376         0.688         0.365    0.697
R(1)        0.0046   0.009   1.656   -0.012        -1.550        0.576         1.148         2.184    0.131
R(0,1)      0.0039   0.012   1.215   -0.019        -1.346        0.213         0.417         1.195    0.318
R(0,2)      0.0097   0.024   2.730   -0.033        -2.410        0.534         1.754         7.473    0.003
WLS Regressions*
R(0)       -0.0007   0.003   0.489   -0.004        -0.417        [illegible]   [illegible]   0.717    0.497
R(1)        0.0046   0.009   1.721   -0.011        -1.586        0.842         2.227         5.600    0.008
R(0,1)      0.0039   0.014   1.477   -0.018        -1.352        0.408         0.951         2.314    0.117
R(0,2)      0.0097   0.027   2.941   [illegible]   -2.412        0.449         1.590         9.717    0.001

R(.,.) = rate of return during the stated interval,
NA = indicator variable for change or intention to change to nonaccrual status,
LL = indicator variable for change or intention to change provision for loan losses,
RM = market cumulative return for the specified interval,
RI = industry cumulative return for the specified interval.
Tests of significance:
t-value for probability <0.10 (two-tailed) = 1.701
t-value for probability <0.05 (two-tailed) = 2.048
t-value for probability <0.01 (two-tailed) = 2.763
* WLS regression weights are based on the residual variance from daily time-series regressions for each
bank from January 2 to September 30, 1987.





Argentina, Brazil, and the Philippines). Also forewarned were the U.S. rating agencies and senior U.S.
bankers (Bartlett 1987). Citicorp’s prices had declined slightly over the previous week and dropped 62.5 cents
between close of trading on May 15 (Friday) and May 18 (Monday).
Table 3
Cumulative Daily Returns for One- and Two-Day Intervals

Announcement
Date in WSJ                                     Days    RP         RP-RI      t*        RP-RC      t*

Panel A. Initial Announcement:
2/20/87        Brazil declares                  0      -0.01000   -0.00952   -1.477    -0.01204   -1.380
(Friday)       moratorium                       1      -0.03543   -0.01883   -2.921    -0.03659   -4.192
                                                0,1    -0.04543   -0.02835   -3.110    -0.04883   -3.940

Panel B. Citicorp Loan Loss:
5/20/87        Citicorp increases loan loss provision:
(Wednesday)    Citicorp included                0      -0.00428   -0.00294   -0.498     0.00541    0.633
                                                1       0.02844    0.01397    2.387     0.02604    3.045
                                                0,1     0.02415    0.01103    1.322     0.03146    2.600
               Citicorp excluded                0      -0.00878   -0.00741   -1.288     0.00084    0.112
                                                1       0.02649    0.01203    2.087     0.02410    2.877
                                                0,1    -0.00085    0.00461    0.566     0.02503    2.114

Panel C. Loan Loss Announcements:
5/20 to 7/22   Intention/increase in            0       0.00825    0.00481    1.302     0.00409    1.041
               loan loss provision              1       0.01156    0.00826    2.190     0.00758    1.933
                                                0,1     0.01981    0.01318    2.469     0.01168    2.103

Panel D. Nonaccrual Announcements:
3/16 to 4/22   Intention/change to              0      -0.00046    0.00113    0.261    -0.00135   -0.242
               nonaccrual                       1      -0.00035   -0.00216   -0.48989  -0.00153   -0.276
                                                0,1    -0.00081   -0.00103   -0.168    -0.00288   -0.366

* Calculation of t-values: For the February 20, 1987 event, based on excess daily return, (RP-RI) or
(RP-RC), for the full study period excluding days -25 to 25 relative to February 20; for the May 20, 1987 event,
based on the full study period excluding days -28 to 25 relative to May 20; for the LL dates, based on the
51-day study period excluding days 0 to 1 relative to the LL date; and for the NA dates, based on the 51-day
study period excluding days 0 and 1 relative to the NA date.


cantly positive excess returns on day 1 when Citicorp is excluded from the calculation, 
which suggests that Citicorp is not solely responsible for this result.¹⁷ Third, we docu-
ment a strong positive response to LL in that the excess returns for days 0-1 are 
significant at less than 2.5 percent (panel C). However, the response to nonaccrual an- 
nouncements (NA) shown in panel D, though negative, is statistically insignificant over 
the short intervals. 
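
The t-values in table 3 can be checked for internal consistency with a small calculation. The sketch assumes that each t-value equals the excess return divided by the standard deviation of daily excess returns in the nonevent period, scaled by the square root of the number of days in the interval; the table note says only that the t-values are based on nonevent-period excess returns, so this formula is an assumption. The inputs are the RP-RI entries of panel A.

```python
import math

# (days in interval, RP-RI excess return, reported t-value), table 3, panel A.
rows = [
    (1, -0.00952, -1.477),   # day 0
    (1, -0.01883, -2.921),   # day 1
    (2, -0.02835, -3.110),   # days 0-1
]

# Under the assumed formula t = excess / (sd * sqrt(n)), each row implies
# a daily standard deviation of nonevent excess returns.
implied_sd = [excess / (t * math.sqrt(n)) for n, excess, t in rows]

for (n, excess, t), sd in zip(rows, implied_sd):
    print(f"n={n}  excess={excess:+.5f}  t={t:+.3f}  implied daily sd={sd:.6f}")
```

All three rows imply nearly the same daily standard deviation (about 0.0064), which is what one would expect if the one-day and two-day t-values share a single nonevent-period variance estimate.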


17 These results are generally consistent with those of Musumeci and Sinkey (1990b) and Grammatikos and
Saunders (1990). However, the latter’s results, based on daily returns for day 0 (or days -1 and 0 if the an-
nouncement appeared on the Dow Jones Broad Tape the previous day), suggest a heterogeneous response to
the Citicorp announcement. By focusing on days -1 and 0, Grammatikos and Saunders fail to detect the signif-
icant positive response, which occurs primarily on day 1.
To examine the “permanence” of these announcements, we further calculated the
cumulative average returns over a 51-day period for each of the LL and NA events and
plotted these in figure 1. The results show a marked, nontransitory response to the LL
announcements, which investors apparently impounded by day 2 at the latest, but the
market response to NA status appears to be dispersed beyond days 0 and 1 (e.g., the
average excess return, RP - RC, for days 0-4 is -2.7 percent, which is significant at
less than 2 percent). The predicted effects of the NA reclassification thus appear to be
centered only broadly around the announcement date, though this is not unreasonable
given investors’ prior knowledge of the Brazilian debt moratorium, the 90-day GAAP
rule for nonaccrual status, and other news reports (see also fn. 5).
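
As a rough illustration of the calculation behind figure 1, the sketch below accumulates daily excess returns (sample return minus benchmark return) over an event window. The function name and the five-day return series are hypothetical and serve only to show the arithmetic; they are not data from the study.

```python
from itertools import accumulate


def cumulative_excess_returns(sample_returns, benchmark_returns):
    """Return the running sum of daily excess returns (RP_t - RI_t)."""
    if len(sample_returns) != len(benchmark_returns):
        raise ValueError("return series must cover the same days")
    excess = [rp - ri for rp, ri in zip(sample_returns, benchmark_returns)]
    return list(accumulate(excess))


# Hypothetical daily returns for a short window (illustration only).
rp = [0.010, -0.002, 0.015, 0.004, -0.001]  # stand-in for the 13-bank average
ri = [0.004, 0.001, 0.005, 0.002, 0.000]    # stand-in for the benchmark index
car = cumulative_excess_returns(rp, ri)
print([round(value, 3) for value in car])
```

A response that is impounded quickly would show up as a jump in this series near day 0 followed by a flat path, whereas a dispersed response would keep drifting over later days.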


IV. Potential Explanatory Factors 


Much information was available during the period of study to anticipate and condition
the market’s responses to accounting announcements. Although we use stock
market indexes and a nonannouncing, nonexposed bank sample as potential controls,
two other data items, secondary market loan prices and financial statement measures
of foreign exposure, would enable a further understanding and explanation of those
announcement effects. We first examine the sensitivity of a bank’s returns to changes
in secondary loan prices and whether that sensitivity might have changed when banks
added to their loan loss provisions (the loan loss period). A finding that bank returns are
relatively insensitive to changes in loan prices during the loan loss period would be
consistent with the view that the observed positive response to the loan loss announcements
did not offset the impairment effects of declining secondary loan prices. Second,
we assess the extent to which banks’ returns can be explained cross-sectionally by their
foreign loan exposure as proxied by financial statement measures. That association
should be negative at a time the market revises downward the probabilities of payment
of interest and principal (e.g., when Brazil announced its debt freeze) and positive
when loan loss provision effects dominate (e.g., when Citicorp announced its loan loss
increase).


Models 


We specify both time-series and cross-sectional models to test for potential explana- 
tory factors. Using X’ (.,.) to denote variables not captured by the secondary-price data, 
we state a general time-series model for each bank i (or a portfolio p of the 13 banks) as: 


$R_{it} = \alpha_i + \beta_i BZP_t + \gamma_i D.BZP_t + X'(.,.) + \epsilon_{it}$, for banks i = 1, ..., 13, (2.0)


and estimate two versions over the year ending September 1987 using monthly data,
the only data available for both returns and secondary loan prices for the period of
study. The specific equations are:

$R_{it} = \alpha_i + \beta_i BZP_t + \gamma_i D.BZP_t + \theta_i RM_t + \epsilon_{it}$, for banks i = 1, ..., 13, and (2.1)
$R_{it} = \alpha_i + \beta_i BZP_t + \gamma_i D.BZP_t + \theta_i RI_t + \epsilon_{it}$, for banks i = 1, ..., 13, (2.2)


where:
$R_{it}$ = return for bank i for month t, where t = October 1986-September 1987;
$BZP_t$ = percentage change in secondary price for Brazilian loans during month t;
$D$ = 1 for months June-August 1987, otherwise 0;
$RI_t$ = monthly return for NYSE financial stock index;
$RM_t$ = monthly return for NYSE industrials index; and
$\epsilon_{it}$ = uncorrelated random error.
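
To make the structure of equation (2.1) concrete, the following sketch fits the four coefficients by ordinary least squares on synthetic, noise-free data and recovers the values used to generate it. Everything here is an assumption for illustration: the helper names, the twelve monthly values of BZP and RM, and the generating coefficients are hypothetical, and the study's actual estimation procedure is not documented beyond the equations above.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Months run October 1986 (index 0) through September 1987 (index 11).
# D = 1 for June-August 1987, i.e. indices 8, 9, and 10.
d = [1 if 8 <= t <= 10 else 0 for t in range(12)]

# Hypothetical monthly series (illustration only, not the study's data).
bzp = [-0.02, 0.01, -0.03, 0.00, -0.08, -0.05, 0.02, -0.04, -0.06, -0.03, -0.10, 0.01]
rm = [0.05, 0.02, -0.03, 0.13, 0.04, 0.03, -0.01, 0.01, 0.05, 0.05, 0.04, -0.02]

# Hypothetical generating coefficients: alpha, beta, gamma, theta.
true_coefs = [0.01, 0.35, -0.47, 0.50]
rows = [[1.0, bzp[t], d[t] * bzp[t], rm[t]] for t in range(12)]
returns = [sum(c * x for c, x in zip(true_coefs, row)) for row in rows]

alpha, beta, gamma, theta = ols(rows, returns)
print(round(alpha, 4), round(beta, 4), round(gamma, 4), round(theta, 4))
```

The interaction column `d[t] * bzp[t]` is what lets the sensitivity to loan prices differ inside the loan loss period; outside that period the column is zero and only `beta` applies.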


Second, we estimate cross-sectional models based on monthly and daily return data 
applied to a common time interval or date for all banks. The monthly data are used to 
assess whether returns are differentially affected by a bank’s foreign loan exposure, 
and the daily data are used to focus on the May 20 Citicorp loan loss announcement 
and the February 20 Brazilian debt freeze—both relevant common dates on which 
banks’ prices responded significantly. For each return interval t, we estimate: 


$R_{it} = \alpha_t + \beta_t' FL_i + \epsilon_{it}$, for return interval t, and (2.3)
$R_{it} = \alpha_t + \beta_t' NAFL_i + \epsilon_{it}$, for return interval t, (2.4)

where:

$R_{it}$ = either return for bank i for month t or two-day return for bank i for the May
20 and February 20 announcements;
$FL_i$ = foreign loans at December 31, 1987, for bank i;
$NAFL_i$ = foreign nonaccrual loans at December 31, 1987, for bank i; and
$\epsilon_{it}$ = uncorrelated random error.
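
The cross-sectional models (2.3) and (2.4) are one-regressor regressions with an intercept, so the slope t-value and R² are linked by R² = t²/(t² + n - 2). The sketch below, which uses hypothetical exposure and return figures for 13 banks (not the study's data) and hypothetical function names, computes the slope, its t-value, and R², and then checks that identity.

```python
import math


def simple_ols(x, y):
    """Slope, slope t-value, and R-squared for y = a + b*x + e."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_error = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / std_error, 1.0 - sse / syy


def r2_from_t(t_value, n):
    """R-squared implied by the slope t-value in a one-regressor model."""
    return t_value ** 2 / (t_value ** 2 + n - 2)


# Hypothetical foreign loan exposure and two-day returns for 13 banks.
fl = [2.0, 3.5, 1.0, 6.0, 4.2, 8.1, 0.8, 5.5, 7.3, 2.9, 9.0, 1.7, 4.8]
ret = [0.010, 0.021, 0.004, 0.030, 0.018, 0.047, 0.009,
       0.024, 0.041, 0.012, 0.043, 0.015, 0.020]

slope, t_value, r2 = simple_ols(fl, ret)
print(round(slope, 5), round(t_value, 3), round(r2, 3))
print(round(r2_from_t(2.793, 13), 3))
```

With n = 13, a t-value of 2.793 implies an R² of about 0.415, which agrees with the two-day Citicorp row reported for equation (2.3) in table 5.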


Because the banks with greater foreign exposure are more prone to foreign loan losses,
we should observe $\beta_t' < 0$ in equations (2.3) and (2.4) during the months investors
downgraded their prior repayment probabilities (e.g., February 1987). However, during the
loan loss period, particularly on May 20, the direction of the relation should be
reversed, given that stock prices responded positively to the broader implications of the
events.


Secondary Loan Prices


Table 4 reports the sensitivity of monthly stock returns to changes in secondary
market prices of Brazilian loans during October 1986-September 1987. The variable
D.BZP estimates the potential change in sensitivity during June-August 1987, the
period when bank stocks and secondary loan prices were potentially moving in opposite
directions, in part due to the effects of loan loss announcements. The results are
uniformly consistent across the 13 banks for both models. Secondary loan prices
correlate positively with stock returns, except during the loan loss period. For example,
equation (2.1) estimates the coefficient of sensitivity for the average return of the 13
banks as positive (0.350) and significant (t = 1.759), except during the loan loss period,
when it was negative (-0.121 = 0.350 - 0.471). Thus, a 10 percent drop in Brazilian loan
prices in 1987 correlates with a 3.5 percent drop in bank equity prices, except during
the loan loss period, when the positive effects of the increased provisions more than
offset the revaluation effects of declining loan prices.¹⁹
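
The arithmetic in the preceding paragraph can be restated as a short calculation. The two coefficients are the portfolio estimates quoted in the text for equation (2.1); the function name and the 10 percent price change are illustrative assumptions.

```python
BETA_BZP = 0.350       # portfolio coefficient on BZP, equation (2.1)
GAMMA_D_BZP = -0.471   # shift in that coefficient during June-August 1987


def implied_return(loan_price_change, in_loan_loss_period):
    """Stock return associated with a given change in Brazilian loan prices."""
    sensitivity = BETA_BZP + (GAMMA_D_BZP if in_loan_loss_period else 0.0)
    return sensitivity * loan_price_change


outside_period = implied_return(-0.10, in_loan_loss_period=False)
inside_period = implied_return(-0.10, in_loan_loss_period=True)
print(round(outside_period, 4), round(inside_period, 4))
```

Outside the loan loss period a 10 percent price drop maps to a 3.5 percent decline in returns, while inside it the net coefficient of -0.121 turns the same price drop into a small positive association.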


18 We chose December 31, 1987, rather than 1986 data so as to proxy better the measures of foreign and
nonaccrual loans that would have been reflected in investors’ expectations and hence returns. Using 1986 data,
however, had no qualitative effect on the results.

19 We also investigated the potential relation between the $\beta_i$ coefficients for variable $BZP_t$ and financial
statement measures of loan exposure, since both proxy for a bank stock's potential sensitivity to changes in
foreign loan values. We estimated the cross-sectional model $\beta_i = a + w^* LEP_i + e_i$, where the loan exposure
Table 4

Association Between Brazilian Loan Prices and
Monthly Returns: Time-Series Regressions


Panel A. Equation (2.1): $R_{it} = \alpha_i + \beta_i BZP_t + \gamma_i D.BZP_t + \theta_i RM_t + \epsilon_{it}$

Bank                      RM       t(RM)    BZP      t(BZP)
BankAmerica               0.108    0.089    0.827    0.863
Bankers Trust             0.305    0.821    0.427    1.451
Chemical Bank             0.722    2.677    0.103    0.482
Chase Manhattan Bank      0.602    2.209    0.232    1.076
Continental Bank          0.591    1.016    0.062   -0.135
Citicorp                  0.058    0.159    0.526    1.889
First Chicago Bank        0.912    1.894    0.438    1.143
First Interstate Bank     0.739    2.087    0.407    1.438
Mellon Bank               0.428    0.498    0.170    0.249
J. P. Morgan              0.693    1.857    0.240    0.811
Manufacturers Hanover     0.175    0.552    0.580    2.234
Security Pacific          0.878    1.593    0.432    0.992
Wells Fargo               0.413    0.682    0.248    0.518
Portfolio (13 banks)      0.508    2.029    0.350    1.759


Panel B. Equation (2.2): $R_{it} = \alpha_i + \beta_i BZP_t + \gamma_i D.BZP_t + \theta_i RI_t + \epsilon_{it}$

Bank                      RI       t(RI)    BZP      t(BZP)
BankAmerica              -0.153   -0.121    0.932    0.932
Bankers Trust             0.515    1.424    0.322    1.126
Chemical Bank             0.838    3.314    0.018    0.089
Chase Manhattan Bank      0.542    1.759    0.228    0.936
Continental Bank          0.770    1.308   -0.168   -0.361
Citicorp                 -0.061   -0.164    0.573    1.949
First Chicago Bank        0.774    1.426    0.449    1.048
First Interstate Bank     0.735    1.910    0.372    1.224
Mellon Bank               0.922    1.078   -0.062   -0.091
J. P. Morgan              0.814    2.201    0.154    0.528
Manufacturers Hanover     0.188    0.567    0.546    2.084
Security Pacific          0.827    1.392    0.410    0.873
Wells Fargo               0.8673   1.106    0.117    0.244
Portfolio (13 banks)      0.568    2.233    0.299    1.491


R_i = monthly return for the ith bank,
BZP = percentage change in secondary price for Brazilian loans,
D = 1 for months June-August 1987, and 0 otherwise,
RM = monthly return of market index, and
RI = monthly return for NYSE financial index.


Tests of significance:

t-value for probability <0.20 (two-tailed) = 1.372
t-value for probability <0.10 (two-tailed) = 1.812
t-value for probability <0.05 (two-tailed) = 2.228


D.BZP       -1.249   -0.435   -0.098   -0.272   -0.026   -0.611   -0.281   -0.781

t(D.BZP)    -0.783   -0.888
             0.148    0.182    0.456
             0.562    0.513
             0.672



proxies ($LEP_i$) are defined as below. The results below, which use the $\beta_i$ coefficients estimated for equation (2.1)
(table 4) as the dependent variable, reveal that the market-based and balance sheet measures are positively
correlated. The market-based sensitivity coefficients and foreign loan exposure proxies apparently reflect a
common economic influence, e.g., a bank’s exposure to foreign debt.

Loan Exposure Proxy (LEP_i)    Foreign Nonaccrual Loans    Foreign Loans    Brazilian Loans
w*                             0.7598                      13.420           8.174
t(w*)                          1.598                       3.054            1.680
Table 5
Association Between Reported Foreign Loan Data and Returns:
Cross-Sectional Regressions

Panel A. Monthly Cross-Sectional Regressions:†

                      Equation (2.3)                 Equation (2.4)
                      1987 Foreign Loans             1987 Nonaccrual Foreign Loans
Return Interval       FL*      t(FL)    R²           NAFL*    t(NAFL)   R²
October 1986          0.033    0.107    0.001        0.803    0.318     0.009
November 1986         0.123   -0.970    0.079        0.805    0.751     0.049
December 1986        -0.089    1.038    0.089        0.747    0.842     0.075
January 1987         -0.021   -0.127    0.001       -0.480   -0.343     0.011
February 1987        -0.178   -1.537    0.177       -1.478   -1.541     0.178
March 1987            0.134    1.087    0.097        1.070    1.045     0.080
April 1987            0.300    1.659    0.181        2.190    1.341     0.141
May 1987              0.073    0.489    0.020        0.686    0.531     0.025
June 1987            -0.042   -0.441    0.017       -0.589   -0.765     0.051
July 1987            -0.080   -0.688    0.039       -0.394   -0.347     0.011
August 1987           0.085    1.007    0.084        0.633    0.799     0.055
September 1987        0.147   -1.243    0.123       -0.941   -0.634     0.074

Panel B. Daily Cross-Sectional Regressions:†

                      Equation (2.3)                 Equation (2.4)
                      1987 Foreign Loans             1987 Nonaccrual Foreign Loans
Return Interval       FL*      t(FL)    R²           NAFL*    t(NAFL)   R²
Citicorp Loan Loss
5/20                  0.015    2.282    0.321        0.116    2.133     0.293
5/21                  0.012    1.968    0.260        0.099    1.883     0.244
5/20-21 (2 days)      0.013    2.793    0.415        0.107    2.611     0.383
Brazilian Moratorium
2/20                 -0.091   -1.808    0.229       -0.571   -1.293     0.132
2/23                 -0.093   -1.874    0.242       -0.785   -1.946     0.256
2/20-23 (2 days)     -0.092   -1.841    0.235       -0.683   -1.601     0.189

Tests of significance:

t-value for probability <0.10 (two-tailed) = 1.812
t-value for probability <0.05 (two-tailed) = 2.228
t-value for probability <0.02 (two-tailed) = 2.784
* Coefficient value x 10?
† Cross-sectional regression equations:
$R_{it} = \alpha_t + \beta_t' FL_i + \epsilon_{it}$, for monthly or daily return intervals, Equation (2.3)
$R_{it} = \alpha_t + \beta_t' NAFL_i + \epsilon_{it}$, for monthly or daily return intervals, Equation (2.4)
where FL is foreign loans and NAFL is nonaccrual foreign loans.


Foreign Loan Exposure


The association between banks’ returns and two financial statement proxies of a 
bank’s foreign loan exposure is shown in table 5. Panel A reports the results of the 
monthly cross-sectional regressions. Other than the weakly significant negative coefficients
in February (when Brazil announced its debt moratorium), the monthly data


844 The Accounting Review, October 1991 


reveal little to suggest that banks’ monthly returns can be explained by these account- 
ing proxies for foreign loan exposure. However, a different picture emerges when the 
same models are applied to daily returns surrounding the Citicorp announcement and 
the Brazilian debt freeze (panel B). The relation is generally strongest when measured 
over days 0 and 1.²⁰ For example, the association between foreign loan exposure and
returns is significantly positive for the Citicorp event (t= 2.793) and significantly nega- 
tive for the moratorium announcement (t= — 1.841). 

Overall, then, the results indicate that banks’ market returns were generally influ- 
enced by secondary market prices for Brazilian loans (table 4) but, more specifically, 
they were differentially affected by the reported amount of foreign loan exposure (table 
5). At the time of the Citicorp loan loss announcement, stock prices increased as a func- 
tion of the magnitude of banks’ reported foreign loan exposure; at the time of the Brazil- 
ian moratorium, stock prices decreased in proportion to reported foreign loan expo- 
sure. Such results suggest that the stock market discriminates among banks based on 
reported foreign loan data. Finally, we note that the effects of declining secondary loan 
prices during 1987 enhance the finding that the market responded positively to loan 
loss announcements since under more normal circumstances the adverse loan revaluation
effects would have shifted banks’ values in the opposite direction.²¹


V. Summary and Conclusions 


This study has presented evidence about the market effects of nonaccrual status 
and loan-loss addition disclosures on the stock returns of 13 of the largest U.S. money-center
and regional banks. The results provide insights into the effects of bank disclosures
that depend on accrual accounting decisions and on management strategy and
policy in response to an issue of major economic and political significance. 

We found that, on average, the capital market responded negatively to banks’ news 
of changes in the nonaccrual status and positively to their subsequent announcements 
of additions to the loan loss provision. Whereas the former result supports the view 
that such news provided incremental information about the impairment of Brazilian 
and other foreign loans, in the second instance the effects of such value changes were 
eclipsed by a positive market response to the increase in loan loss provision. We interpret
this latter result as consistent with the notion that the loan loss addition provided a
credible signal to investors about banks’ intentions and abilities to resolve the Latin 
American debt situation favorably. 

We also found that secondary loan prices and reported balance sheet amounts of
nonaccrual or foreign loans correlated significantly with common stock returns during 
the study period—the greater the measure of foreign exposure the stronger the price 
adjustment. Results based on daily returns show that balance-sheet loan data partially 
explain not only the positive market response to Citicorp’s May 20 increase in loan loss 
provision but also the negative response to the February 20 Brazilian debt freeze. 


"7 Musumeci and Sinkey (1990a) and Elliott et al. (1989) report similar results for the February 20 and May 
20 events, respectively, using daily stock returns for samples of banks that include the 13 banks
studied here.

21 As a percentage of face value of the loan, Brazilian loan secondary market prices dropped from 75.25
(January 1987) to 46.25 (December 1987). In June, July, and August 1987, the average percentages of face value
were 62.0, 59.5, and 53.0, respectively.
Figure 1
Cumulative Excess Returns Surrounding Loan Loss and Nonaccrual Announcements
Panel A. Loan Loss Announcements:

Cumulative daily residual return as of day T = $\sum_{t=-25}^{T} (RP_t - RI_t)$, where t = -25, ..., T relative to first news of
loan loss/nonaccrual announcement (day 0); RP_t = equally weighted average daily return of 13-bank sample;
and RI_t = daily return for NYSE financial stock index.

Cumulative daily return difference as of day T = $\sum_{t=-25}^{T} (RP_t - RC_t)$, where t = -25, ..., T relative to first
news of loan loss/nonaccrual announcement (day 0); RC_t = average daily return for control sample relative to
first news of loan loss/nonaccrual announcement.
SYNOPSIS AND INTRODUCTION: In this article, we examine the
information content of announcements of increased reserves for loan loss
by Citicorp and other banks, and the later write-off announcement made by
the Bank of Boston. During 1987, most major U.S. banks, led by Citicorp
on 19 May 1987, announced large increases in their loan loss reserves
because of problem loans in lesser developed countries (LDC). With
substantial flexibility in accounting rules for determining loss exposure, the
banks announced varying levels of reserve increases. On 14 December
1987, the Bank of Boston began a second round of activity relating to LDC
debt by announcing a $200 million write-off of LDC loans and a further
increase in loan loss reserves.
Financial reporters suggested that these events could be interpreted
differently. Because Citicorp was a leading money-center bank, its
announcement could be interpreted favorably as a signal of willingness to
deal with the LDC debt problem. This interpretation could similarly apply to
other banks, especially the more exposed money-center banks. In
comparison, the Bank of Boston announcement was portrayed in the press
as detrimental to the money-center banks for two reasons. First, unlike a
reserve increase, a write-off reduces a bank's capital adequacy ratio.
Capital adequacy ratios are used by bank regulators in determining the
need for, and the level of, supervisory intervention. Second, the write-off
was construed as an effort by regional banks to exploit their relatively
limited exposure to LDC loans as a competitive advantage in the domestic
banking market.
We find evidence consistent with the expectations of the financial
press. The strongest stock-price increases associated with both the Citicorp
announcement and the subsequent announcements of reserve increases by
other banks were found for the banks with the greatest exposure to LDC
debt. In contrast, those banks with the greatest exposure to LDC debt and
with the largest reserves sustained the largest stock-price decreases at the
Bank of Boston write-off announcement. The larger money-center banks
sustained, on average, a three-day decline in value of 5 percent around the
Bank of Boston announcement date.


Key Words: Banking, Loan losses, LDC debt, Discretionary write-offs.
Data Availability: Data are available from the authors.


THE article is structured as follows. In section I, we review bank accounting for
loan losses and describe the three events studied: the Citicorp announcement, the
set of individual announcements, and the Bank of Boston announcement. This
section is followed by hypotheses development, sample selection, and discussion of test
procedures in section II. The results are discussed in section III. A summary of the
findings is provided in section IV.


I. Events Examined


Banks, by regulatory action during the period under study, were required
to maintain a minimum capital level of 5.5 percent.¹ The capital adequacy ratio is
defined as (stockholders’ equity plus loan loss reserves)/total assets. Since the loan loss
reserve is added back to stockholders’ equity for the purpose of calculating the capital
adequacy ratio, a loan loss provision has no impact on the ratio. Only the write-off of a
loan reduces the ratio and therefore has the potential to cause a violation of minimum
legal capital requirements. The U.S. Comptroller of the Currency has expressed
concern that this calculation of the capital adequacy ratio might cause a bank to defer
writing off bad loans that were previously reserved because of the threat of additional
regulatory restrictions.²
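
The mechanics described above can be traced with a small numerical sketch. The balance sheet figures are hypothetical, taxes are ignored, and total assets are assumed to be measured gross of the loan loss reserve; under those assumptions a provision leaves the ratio unchanged while a write-off charged against the reserve lowers it.

```python
def capital_adequacy_ratio(equity, loan_loss_reserve, total_assets):
    """(Stockholders' equity + loan loss reserve) / total assets."""
    return (equity + loan_loss_reserve) / total_assets


# Hypothetical bank, in $ millions (illustration only).
equity, reserve, assets = 4_000.0, 1_500.0, 100_000.0
base = capital_adequacy_ratio(equity, reserve, assets)

# A provision lowers equity and raises the reserve by the same amount.
provision = 1_000.0
after_provision = capital_adequacy_ratio(equity - provision, reserve + provision, assets)

# A write-off is charged against the reserve and removes the loan from assets.
write_off = 500.0
after_write_off = capital_adequacy_ratio(equity, reserve - write_off, assets - write_off)

print(round(base, 4), round(after_provision, 4), round(after_write_off, 4))
```

In this example the bank starts exactly at the 5.5 percent minimum, so the write-off pushes it below the requirement even though the provision did not.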

McNichols and Wilson (1988) examined the effects of a reserve increase on net
income, assuming reserve increases are attempts to smooth income. Little discussion of
the actual write-off of a bad debt was contained in that study since the write-off had no
direct implication on financial ratios. As has been indicated, bank accounting
procedures, unlike those of other industries, provide for a direct interpretation of loan loss
write-offs separate from the effects of the annual loss provision on net income.


1 During the period of our study, the general requirement was
5.5 percent. We find no evidence of deviations from this level for any bank in our sample.

2 Write-offs would be predictable, at least in an accounting sense, given reserve ...
the threat of additional regulatory intervention.
Beaver et al. (1989) provide evidence that banks’ market values are cross-sectionally
correlated with characteristics of their loan loss reserves. We extend their study by 
examining whether changes in loan loss levels are related to changes in market values. 
Three events from 1987 are of interest in this study. They are: (1) a $3 billion increase in 
Citicorp’s reserve for foreign loans, (2) subsequent increases in reserve levels by 45 
other banks, and (3) the announcement by the Bank of Boston of a second increase in re- 
serves and the first significant write-off of foreign debt by a major U.S. bank. 


The Citicorp Announcement 


On 20 February 1987, Brazil declared a moratorium on interest payments on $67 
billion of medium- and long-term debt. Most large banks subsequently announced they 
would no longer accrue interest on their Brazilian debt. The action by Brazil also in- 
creased pressures on the banks to reconsider how they should value the debt on their 
books. Citicorp was the first to react by announcing (19 May 1987) a $3 billion increase 
to its reserve for losses on LDC debt. The increased reserve levels represented 25 per- 
cent of the LDC debt in its loan portfolio and 39 percent of its preannouncement market 
value. Citicorp’s stock price dropped 3.1 percent on May 19th in anticipation of the
announcement, which was broadcast on the Dow Jones News Service (the Broad Tape)
after close of business at 4:45 p.m., but the price rebounded 10.1 percent during the
next two days.³ This price rise was attributed by the popular press to
Citicorp's strategy for dealing with its LDC debt problem.


Announcements by Other Banks 


The Wall Street Journal reported 20 May 1987 that the Citicorp action forced banks 
with lesser resources to decide whether they should and could follow the lead of the 
nation’s largest bank.⁴ The weakness of some large banks was viewed as a motivation
for regulators to allow each bank to make its own evaluation of necessary reserve 
levels. Indeed, BankAmerica immediately announced it would not follow the Citicorp 
lead. However, by 24 July 1987, 45 major U.S. banks, including BankAmerica, had an- 
nounced substantial increases in their loan loss reserve levels. 


Bank of Boston Announcement 


At 4:15 p.m. on 14 December 1987, the Bank of Boston announced an additional in- 
crease in its loan loss reserve of $200 million, classification of $470 million of LDC debt 
into nonaccrual status, and the write-off of $200 million of LDC debt. This write-off was 
the first LDC-related action that significantly reduced a bank’s capital adequacy ratio. 


3 A 10 percent increase in market value is roughly equivalent to $764 million (absent tax effects).

4 An article headlined “Big U.S. Banks Seen Boosting Loan Reserves” cited bank analysts’ predictions that 
other banks “will have to dramatically boost reserves” following a one-day rise of $2.50 in Citicorp’s share 
price (5 percent) and sharp declines in other bank stocks around the world. “In Big Board trading yesterday, 

with stronger reserves, such as J.P. Morgan & Co. and Bankers Trust New York Corp., were up slightly, 
while others, such as Manufacturers Hanover Corp., fell.” While analysts predicted compliance by other 
banks, officers of other banks responded with statements about the need to review Citicorp’s action. 
“Citicorp's circumstances are quite different from our own. ... An increase in our reserve isn't in any way 
automatic; the (Citicorp) decision creates pressure to consider, not pressure to act.” Other articles included
“BankAmerica’s Profit Hopes are Crimped by Citicorp's Debt Move, Cost Estimates”; “British Banks Pressed
to Increase Reserves for Third World Loans”; “Bank Issues Are Sold Off World-Wide In Wake of Citicorp’s
Loan-Loss Move”; and “Citicorp’s Move to Boost Loan-Loss Reserves Wins Praise, May Lead to Better 1988
Profit.”
However, even after the write-off, the Bank of Boston’s capital adequacy ratio was
approximately 8 percent. The Bank of Boston’s stock-price rise of 9.9 percent for the three
days beginning 14 December 1987 was attributed to the announcement's signal of the
bank’s financial strength.

On 16 December 1987, The Wall Street Journal reported that the Bank of Boston
announcement would have wide repercussions because most major money-center banks
could not follow the Bank of Boston write-off without violating minimum capital re- 
quirements. The potential impact on smaller banks was less severe. For example, in 
January 1988, Ss Bank SE began selling off LDC loans at 50 cents per dollar of 
loan. 


II. Hypotheses Development and Sample Selection


The Citicorp and Bank of Boston Announcements 


Both announcements may have been interpreted as steps in a coordinated effort by
U.S. banks to resolve the LDC debt crisis. The decisions can be viewed both as
asset-valuation disclosures and as indications of plans to limit future commitments to the
LDC until existing problems were resolved. Consistent with this interpretation, the first
hypothesis tested is:


H1: Abnormal returns around each announcement for all banks were, on average,
significantly greater than zero.


Many accounts in the popular press during 1987 highlighted differences in the in- 
formation-transfer effects of each announcement on other banks, depending on such 
factors as status as a money-center or regional bank, capital adequacy ratio, relative 
level of exposure to the LDC debt problem, and existing reserves for LDC loans.
Interpretation of intraindustry information transfer may be conditional upon such factors,
which suggests the following hypotheses:


H2: Abnormal returns due to information transfers around the Citicorp (Bank of
Boston) announcement were significantly higher (lower) for money-center
banks than for regional banks.*
H3: Abnormal returns for all banks around the Citicorp and Bank of Boston
announcements were significantly correlated with characteristics of the firms’
loan portfolios, reserves, and economic strength.


Since the effects of some explanatory variables are dependent on a bank’s characteristics,
the effects of differing economic characteristics of the bank are captured by


5 Lawrence Cohn, an analyst at Merrill Lynch & Co. who views Bank of Boston as a regional bank, stated
that “it's a parting of ways between the money-centers who are out to collect on their assets and the regionals
who want to get out.” James I. McDermott, Jr., Director of Research at Keefe Bruyette Woods, Inc., added that it
created an enormous quandary for many weaker money-center banks whose more binding legal capital
constraints made it difficult to follow the Bank of Boston. Although it was unlikely that violations of these capital
levels would cause the bank to “fail” immediately, it would bring in bank examiners, whose regulatory scrutiny
would be at the least embarrassing, and would involve significant negotiating costs in addition to whatever
explicit constraints on behavior an examination might occasion.
6 The 12 banks designated as money-center banks are Citicorp, Chase Manhattan, First Chicago,
Continental Illinois, First Interstate, BankAmerica, Wells Fargo, Manufacturers Hanover, J.P. Morgan, Bankers
Trust, Security Pacific, and Chemical Bank. The Bank of Boston is considered to be a regional bank.
using a multiple regression of the form:

$CAR[-1,+1]_j = \beta_0 + \beta_1 EC_j + \beta_2 LDCPART_j + \beta_3 MVBV_j + \beta_4 RES_j + \epsilon_j$, (1)

where:

$CAR[-1,+1]_j$ = the cumulative abnormal return for bank j from day -1 before
disclosure in The Wall Street Journal to day +1 after the article;
$EC_j$ = [(stockholders’ equity at 12-31-86 + loan loss reserve at 12-31-86)/total
assets at 12-31-86] - 0.055;
$LDCPART_j$ = total exposure to lesser developed countries at 12-31-86/total assets at
12-31-86;
$MVBV_j$ = ratio of market value to book value as of most recent quarterly financial
statements; and
$RES_j$ = (1986 LDC reserve + the first 1987 addition)/LDC exposure, where the
1987 addition equals zero at the Citicorp date.


(Note that both $MVBV_j$ and $RES_j$ take on different values at different regression dates.)
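
The accounting-based regressors in equation (1) follow directly from the definitions above, as the sketch below illustrates for a single bank. All dollar amounts are hypothetical ($ millions) and the function names are ours, introduced only for illustration; the 0.055 constant is the minimum capital level cited in the text.

```python
MINIMUM_CAPITAL_RATIO = 0.055  # 5.5 percent minimum capital level


def excess_capital(equity, loan_loss_reserve, total_assets):
    """EC: capital adequacy ratio in excess of the 5.5 percent minimum."""
    return (equity + loan_loss_reserve) / total_assets - MINIMUM_CAPITAL_RATIO


def ldc_participation(ldc_exposure, total_assets):
    """LDCPART: total LDC exposure as a share of total assets."""
    return ldc_exposure / total_assets


def reserve_ratio(ldc_reserve_1986, first_1987_addition, ldc_exposure):
    """RES: (1986 LDC reserve + first 1987 addition) / LDC exposure."""
    return (ldc_reserve_1986 + first_1987_addition) / ldc_exposure


# Hypothetical year-end 1986 figures for one bank (illustration only).
ec = excess_capital(equity=5_000.0, loan_loss_reserve=1_500.0, total_assets=100_000.0)
ldcpart = ldc_participation(ldc_exposure=3_600.0, total_assets=100_000.0)

# At the Citicorp date the 1987 addition is zero; at later dates it is not.
res_citicorp_date = reserve_ratio(300.0, 0.0, 3_600.0)
res_later_date = reserve_ratio(300.0, 600.0, 3_600.0)

print(round(ec, 4), round(ldcpart, 4),
      round(res_citicorp_date, 4), round(res_later_date, 4))
```

The two RES values show why the variable takes different values at different regression dates: the same bank's ratio rises once its first 1987 reserve addition is included.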

The variable EC measures the excess legal capital level of the bank at the end of
1986. A higher level of EC reduces the danger and cost of violating a bank’s legal capital
requirements in charging off LDC debt losses. Therefore, $\beta_1$ is expected to be positive.
LDCPART reflects the level of exposure to LDC debt.⁷ In conformance with popular
press arguments, the ability to match either the Citicorp or Bank of Boston action is
constrained by the magnitude of a bank’s involvement in LDC lending. Banks with
lesser exposure were, all else equal, able to incur larger percentage provisions and
percentage write-offs of LDC debt. Thus, $\beta_2$ is expected to be negative.

The ratio of market value to book value (MVBV) is included to capture the market's
perception of a bank’s general financial health. We assume that book values are
consistently measured across banks. MVBV is calculated prior to each of the relevant
announcements by using the most recent quarterly financial information then available.
Concurrently, for each bank, market values are measured as market price per
share times the estimated shares outstanding, on the basis of disclosures in the
statements of owners’ equity. Because a bank with better relative overall financial strength
should be better able to withstand the financial consequences of incurring large loan
losses, we expect a positive $\beta_3$ coefficient.⁸

RES provides a measure of the existing reserve for LDC debt prior to the date of the
relevant announcement. A larger value for this ratio indicates that a smaller addition to
the loan loss reserve is needed to match the new level established by the Citicorp
announcement. Assuming it was costly for a bank to increase reserves, a positive
coefficient is expected on this variable at that date. However, at the Bank of Boston date, a
negative coefficient is expected because larger reserves at that date imply write-offs
and declining bank values.


7 LDC exposure varied widely in the sample. Sample statistics for LDCPART were mean (0.036), median
(0.025), standard deviation (0.028), minimum (0.003), and maximum (0.101).

8 LDCPART and MVBV are negatively correlated (Pearson ρ = -0.53). Regressions were re-estimated by
omitting each variable, and the signs, magnitudes, and significances of the coefficients were examined. No
substantial shift occurred in magnitude, sign, or significance. We conclude that multicollinearity is not a significant
problem insofar as bias of coefficient estimates is concerned.
Individual Bank Announcements


Much of the speculation after Citicorp’s announcement related to possible future 
actions by other banks. The financial press suggested that loan composition, existing 
loan loss reserves, capital levels, and tax effects might all affect a bank’s decision to fol- 
low the Citicorp lead or not. Discussions of other banks’ abilities to follow Citicorp’s 
lead predicted negative returns for the more LDC-loan exposed money-center banks at 
the Citicorp date. However, subsequent announcements by these banks of significant 
reserve increases might have indicated willingness and sufficient strength to deal with 
their LDC problems. Thus, generally positive price reactions to the individual an- 
nouncements are expected, with the most positive price reaction expected for the 
money-center banks. We, therefore, offer the following hypotheses, using t-tests for
differences:


H4: Abnormal returns for all banks were significantly greater than zero around the
    date of their announcements of loan loss reserve increases.
H5: For the more exposed money-center banks, abnormal returns were significantly
    greater around their announcements than for the less exposed regional
    banks.


To examine the effects of individual banks’ characteristics on the market interpretation
of their loan loss reserve increases, regression (1) is estimated by aligning all
banks’ returns on the date of their individual announcements. This allows testing of the
following hypothesis:


H6: In event time for individual announcements, the coefficients from regression
    (1) should be negative for EC and MVBV and positive for LDCPART and RES.


Sample Composition and Abnormal Return Measures


Of the 81 banks with foreign loans in excess of $100,000 included on the 1987 version
of the COMPUSTAT bank tape, we selected those banks that: (1) reported information
on foreign loan and loan loss reserve composition in their 1986 and 1987 annual
reports,⁹ and (2) announced increases in their loan loss reserves during 1987.¹⁰ The final
sample contained 46 banks, including both Citicorp and Bank of Boston (reported in
appendix 1 together with some descriptive data).

In table 1, descriptive statistics are reported for the sample banks with a comparison
between money-center and smaller banks since the two groups were expected to differ
in their response to the LDC debt problems. Table 1 shows that money-center banks had
an average of 7 percent of their assets in LDC debt compared to less than 3 percent for
smaller banks. For both groups of banks, the average capital adequacy ratio exceeded
the prescribed level of 5.5 percent (shown as EC, excess legal capital) by approximately
1 percent. Although total average reserves of money-center banks (1.3 percent of total


9 We do not differentiate between countries within the broad category LDC debt. Although many banks dis- 
closed on a country-by-country basis, the practice was not sufficiently widespread to allow a large sample size 
of consistently defined data. Also, finer partitions would reduce the degrees of freedom available for our analy- 
sis. 

10 We identified 45 bank announcements of increased reserves following the Citicorp announcement.
Some banks made a second announcement in January 1988. We focus on the amount and date of the first such
disclosure.


Elliott, Hanna, and Shaw—Evaluation by the Financial Markets 853 


Table 1

Descriptive Statistics on the Sample Banks’ Loan Loss Reserves
and Independent Variables from Regression (1)

                                                Money-Center    Other      Test of Difference
                                    All Banks      Banks        Banks     Between Money-Center
Variables                             n=46         n=12         n=34       and Other Banksᵃ

Panel A. Descriptive Information:
Mean Total Assets
  at 12-31-86ᵇ                      32,132.5     75,556.4     16,808.4          5.04
                                                                               (0.00)
Mean Loan Loss Reserve Ratio
  at 12-31-86ᶜ                         0.010        0.013        0.010          8.39
                                                                            (illegible)
Mean 1987 Loan Loss
  Provision Ratioᶜ                     0.017        0.022        0.015          3.01
                                                                               (0.00)
Mean Stockholders’ Equity Ratio
  at 12-31-86ᶜ                         0.057        0.052        0.059         −2.71
                                                                            (illegible)

Panel B. Variables Used in Regression:
ECᵈ                                    0.012        0.009        0.013         −1.76
                                                                               (0.08)
LDCPARTᵉ                               0.036        0.087        0.025          4.07
                                                                               (0.00)
MVBVᶠ                                  1.310        0.908        1.450         −3.39
                                                                            (illegible)
RESᵍ                                   0.368        0.308        0.390         −1.38
                                                                               (0.17)

Note: Both RES and MVBV take on different values at different regression dates. These are values as of the
Citicorp date regression.
ᵃ Z-statistic from Wilcoxon signed-rank test. Significance level in parentheses. A positive Z-statistic
  indicates a larger value for money-center banks.
ᵇ In millions.
ᶜ Scaled by total assets.
ᵈ EC=[(Stockholders’ equity at 12-31-86 + loan loss reserve at 12-31-86)/total assets at 12-31-86]−0.055.
ᵉ LDCPART=Total exposure to lesser developed countries at 12-31-86/total assets at 12-31-86.
ᶠ MVBV=Ratio of market value at 3-31-87 to book value at 3-31-87.
ᵍ RES=(1986 LDC reserve + first 1987 addition)/LDC exposure.


assets) were significantly higher (α<0.05) than the average of small banks (1.0 percent)
at 31 December 1986, their relative reserves for LDC debt were smaller. The average
reserve on LDC debt after the first round of 1987 additions was 30.6 percent for money-
center banks and 39 percent for smaller banks. In contrast, at the beginning of 1986 the
average reserve level for money-center banks (not tabulated) was only 5.6 percent, com-
pared with 8.2 percent for the smaller banks.
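
As a rough illustration of how the regression variables defined in the notes to table 1 might be computed from balance-sheet items, here is a minimal Python sketch. The class name, field names, and input figures are hypothetical and chosen only for illustration; the formulas follow the table 1 definitions of EC, LDCPART, MVBV, and RES.

```python
from dataclasses import dataclass

# Prescribed capital level cited in the text (5.5 percent of total assets).
CAPITAL_REQUIREMENT = 0.055


@dataclass
class BankInputs:
    """Hypothetical container for one bank's inputs (same currency units throughout)."""
    total_assets: float
    stockholders_equity: float
    loan_loss_reserve: float
    ldc_exposure: float
    ldc_reserve_1986: float
    first_1987_addition: float
    market_value: float
    book_value: float


def regression_variables(b: BankInputs) -> dict:
    """Compute EC, LDCPART, MVBV, and RES as defined in the notes to table 1."""
    ec = (b.stockholders_equity + b.loan_loss_reserve) / b.total_assets - CAPITAL_REQUIREMENT
    ldcpart = b.ldc_exposure / b.total_assets
    mvbv = b.market_value / b.book_value
    res = (b.ldc_reserve_1986 + b.first_1987_addition) / b.ldc_exposure
    return {"EC": ec, "LDCPART": ldcpart, "MVBV": mvbv, "RES": res}


# Made-up figures, not taken from any sample bank.
example = BankInputs(
    total_assets=1000.0,
    stockholders_equity=55.0,
    loan_loss_reserve=12.0,
    ldc_exposure=40.0,
    ldc_reserve_1986=3.0,
    first_1987_addition=9.0,
    market_value=90.0,
    book_value=60.0,
)
variables = regression_variables(example)
```

Note that the table notes define RES differently at different regression dates (for example, without the 1987 addition at the Citicorp date), so the numerator above would need to be adjusted accordingly.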
To assess the effects at each announcement, we calculated expected returns for each
bank from the following return model:

\[
R_{jt} = \alpha_j + \beta_j R_{mt} + \gamma_j I_t + \varepsilon_{jt} \tag{2}
\]

where:

$R_{jt}$ = return on bank j’s stock on day t;
$\alpha_j$ = bank j’s intercept;
$\beta_j$ = bank j’s response coefficient to the market;
$R_{mt}$ = market return on day t;
$\gamma_j$ = bank j’s response coefficient to changes in interest rate;
$I_t$ = daily change in interest rates measured as the percentage change in the
reported yield to maturity on five-year Treasuries from day t−1 to day t; and
$\varepsilon_{jt}$ = normally distributed error term.


The bank-specific coefficients (α_j, β_j, and γ_j) were estimated from the first 50
trading days of the final quarter of 1986. This period was chosen to minimize the effects
of the Brazilian actions in early 1987 on the parameter estimates.¹¹ Returns were obtained
from the CRSP tapes or, for the smaller banks, were calculated from price and
dividend data in Standard and Poor’s Daily Stock Price Record. The interest rate variable
was included to control for common macroeconomic effects on the banking industry.¹²
The daily abnormal returns for each bank were then calculated as:


\[
AR_{jt} = R_{jt} - (\hat{\alpha}_j + \hat{\beta}_j R_{mt} + \hat{\gamma}_j I_t) \tag{3}
\]
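
The two-step procedure described above (estimate the two-factor return model over an estimation period, then subtract the predicted return from the realized return) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, ordinary least squares is assumed as the estimator, and the data are synthetic rather than CRSP returns.

```python
import numpy as np


def fit_two_factor(r_bank, r_mkt, d_rate):
    """OLS estimates (alpha, beta, gamma) of a bank's two-factor return model."""
    X = np.column_stack([np.ones_like(r_mkt), r_mkt, d_rate])
    coef, *_ = np.linalg.lstsq(X, r_bank, rcond=None)
    return coef


def abnormal_returns(r_bank, r_mkt, d_rate, coef):
    """Realized return minus the return predicted by the estimated model."""
    alpha, beta, gamma = coef
    return r_bank - (alpha + beta * r_mkt + gamma * d_rate)


# Synthetic 50-day estimation period (assumed parameter values, for illustration only).
rng = np.random.default_rng(0)
r_mkt = rng.normal(0.0005, 0.01, 50)      # market return
d_rate = rng.normal(0.0, 0.005, 50)       # percentage change in the five-year yield
r_bank = 0.0002 + 1.1 * r_mkt - 0.3 * d_rate + rng.normal(0.0, 0.002, 50)

coef = fit_two_factor(r_bank, r_mkt, d_rate)
ar = abnormal_returns(r_bank, r_mkt, d_rate, coef)
```

In an actual application the coefficients would be estimated on the estimation period and then applied to event-period returns; here the residuals are computed in-sample only to keep the sketch short.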
III. Results


In examining the effect of each announcement on bank equity prices, we compute
and report daily abnormal returns for a period of two days before and two days after
The Wall Street Journal publication of the announcement. Regression (1) was estimated
for a three-day window centered on the announcement since substantial stock-price
movement occurred for the three days for both Citicorp and the Bank of Boston. We
concentrate on the information-transfer characteristics of these two dates. In contrast,


11 To mitigate concern over shifts in coefficients over this period, we also calculated the coefficients
for the conventional market model (α_j, β_j) from the last quarter of 1986 and the first quarter of 1988.
Comparisons using paired t-tests did not reveal a statistically significant difference in the parameters. Our
estimated coefficients were similar on average to values reported in other studies of bank returns using a
two-factor model (e.g., Brewer and Lee 1985).

12 This interest rate variable has been used in several banking studies (e.g., Kane and Unal 1988). To assess
its contribution, we combined our sample firms with return data available on CRSP into one portfolio, and into
large- and small-bank subportfolios. We regressed these portfolios’ daily returns on R_m alone and on the two-
factor model in equation (2). The R² measures rose slightly when the interest factor was included. The t-statistic
on γ_j exceeded two in all three cases (α<0.03).

13 Although two-day windows have become standard in event studies, we report three-day windows,
defined as days −1 through +1. Here the major events were announced after market closing on day −1, but
information leakage is always a possibility. Ex post, changes were observed for day +1 for Citicorp’s and Bank
of Boston’s own stock prices around their announcement dates. Thus, the three-day window results are
reported. However, we did replicate the study using the two-day windows (−1,0) and (0,+1). The (−1,0) window
suggests stronger, more significant, negative stock-price reactions to the Citicorp announcement, especially
for smaller banks. The positive effect for large banks is concentrated on day +1. However, equation (1) still had
no explanatory power for the shorter windows. All other inferences and results were essentially unchanged
when using the two-day windows.
Table 2

Market Reaction to the Citicorp Announcement: The Minimal Information Transfer Case
(Excluding Citicorp; Day 0 is 20 May 1987)

Panel A. Abnormal Returns Surrounding the Citicorp Announcement:

                 All Banks                Money-Center              Other Banks
Day         RETURN    t-statistic     RETURN    t-statistic     RETURN    t-statistic
−2         −0.0055      −2.03        −0.0045   [illegible]    [illegible] [illegible]
−1         −0.0009      −0.33        −0.0017      −0.34        −0.0006      −0.20
 0         −0.0084      −2.398       −0.0048      −0.898       −0.0070      −2.17
+1          0.0081       2.24         0.0180   [illegible]      0.0023       0.70
+2          0.0002       0.07        −0.0073      −1.44         0.0026       0.81
R[−1,+1]ᵃ  −0.0013   [illegible]      0.0114   [illegible]     −0.0054   [illegible]

Panel B. Regression of Potential Explanatory Factors; Adjusted R² = −0.00
(t-statistics in Parentheses):

R[−1,+1]ᵃ  =    β0    +   β1 ECᵇ   +  β2 LDCPARTᶜ  +   β3 MVBVᵈ   +   β4 RESᵉ
             −0.005      −0.168        −0.090           0.001         −0.073
             (0.22)      (0.269)      (−0.44)       ([illegible])     (−0.93)

ᵃ Average abnormal returns for days −1 to +1 where day 0 is the date the reserve increase was disclosed
  in The Wall Street Journal.
ᵇ EC=[(stockholders’ equity at 12-31-86 + loan loss reserve at 12-31-86)/total assets at 12-31-86]−0.055.
ᶜ LDCPART=Total exposure to lesser developed countries at 12-31-86/total assets at 12-31-86.
ᵈ MVBV=Ratio of market value (3-31-87) to book value (3-31-87).
ᵉ RES=1986 LDC reserve/LDC exposure.


for the individual bank announcements, we examine only the effect of each firm’s
announcement on its own stock price and use regression results to associate these
reactions with the characteristics of the announcing banks in event time.¹⁴ As described
below in detail, we find minimal information transfers at the Citicorp announcement,
significant bank-specific effects (on average) at the individual announcements, and
significant information transfers at the Bank of Boston date.


The Citicorp Announcement: The Minimal Information Transfer Case


To test for information transfers at the Citicorp announcement, we report t-tests of
average abnormal returns for the other 45 banks. The t-statistic is based on a portfolio
approach that takes into consideration cross-sectional correlation in residuals (see
Brown and Warner 1985 for discussion). The results are presented in panel A of
table 2. On average, the announcement had no significant effect on returns of other
banks during the three-day window (−0.13 percent, α=0.78). That is, the first hypothesis
is not supported. However, the average abnormal return for the money-center
banks (1.14 percent) was significantly greater (α<0.03, one-tailed test) than that of the
regional banks (−0.54 percent). This result supports hypothesis H2 and is consistent

14 All regressions are tabulated and discussed relative to estimations that treat all firms equally. We also
estimated regressions by allowing slope coefficients to differ between money-center and other
banks. No differences in interpretation resulted.
Table 3

Market Reactions to Individual Bank Announcements: The Bank-Specific Effects Case
(Excluding Citicorp; Day 0 is Individual Bank Announcement Date)

Panel A. Abnormal Returns Surrounding the Individual Bank Announcements:

                 All Banks                Money-Center              Other Banks
                   n=45                      n=11                      n=34
Day         RETURN    t-statistic     RETURN    t-statistic     RETURN    t-statistic
−2         −0.0010      −0.38         0.0012       0.23        −0.0017      −0.64
−1          0.0040       1.49         0.0035       0.70         0.0042       1.30
 0         −0.0010      −0.36         0.0016       0.32        −0.0018      −0.56
+1          0.0071       2.62         0.0141       2.79         0.0049       4.51
+2          0.0001       0.04         0.0040       0.78        −0.0011      −0.35
R[−1,+1]ᵃ   0.0102       2.16         0.0192       2.20         0.0073       1.30

Panel B. Regression of Potential Explanatory Factors; Adjusted R² = [illegible]
(t-statistics in Parentheses):

R[−1,+1]ᵃ  =    β0    +   β1 ECᵇ   +  β2 LDCPARTᶜ  +   β3 MVBVᵈ   +   β4 RESᵉ
              0.023       0.690         0.087          −0.012         −0.022
             (1.11)      (1.16)        (0.45)         (−1.30)        (−0.98)

ᵃ Average abnormal returns for days −1 to +1 where day 0 is the date the reserve increase was disclosed
  in The Wall Street Journal.
ᵇ EC=[(stockholders’ equity at 12-31-86 + loan loss reserve at 12-31-86)/total assets at 12-31-86]−0.055.
ᶜ LDCPART=Total exposure to lesser developed countries at 12-31-86/total assets at 12-31-86.
ᵈ MVBV=Ratio of market value to book value before announcement in calendar time at most recent
  quarter end (3-31-87 or 6-30-87).
ᵉ RES=(1986 LDC reserve + first 1987 addition)/LDC exposure.


with the view that the announcement was an effort by the highly exposed money-center
banks to deal with their LDC exposure problem.

However, as shown in panel B of table 2, estimating regression (1) does not support
any strong inferences about information transfer, which is contrary to hypothesis H3
and to the popular press allegations. Thus, the information content of the Citicorp
announcement appears to be specific to Citicorp.
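
The portfolio approach to the t-statistic mentioned in this section can be sketched as follows. This is a hedged illustration in the spirit of Brown and Warner (1985), not necessarily the exact computation used for the tables: abnormal returns are averaged across banks each day, and the standard deviation of that portfolio average over the estimation period serves as the standard error, which implicitly absorbs cross-sectional correlation. The function name, array shapes, and synthetic data are assumptions.

```python
import numpy as np


def portfolio_t_stats(ar_est, ar_event):
    """Portfolio-based t-statistics for average abnormal returns.

    ar_est   : (T_est, N) abnormal returns over the estimation period
    ar_event : (T_evt, N) abnormal returns over the event window
    """
    # Equally weighted portfolio abnormal return on each estimation-period day.
    port_est = ar_est.mean(axis=1)
    # Time-series standard deviation of the portfolio abnormal return.
    s = port_est.std(ddof=1)
    # Average abnormal return across banks on each event day.
    port_evt = ar_event.mean(axis=1)
    daily_t = port_evt / s
    # Cumulative window return scaled by the window-length standard error,
    # assuming independence across days.
    window_t = port_evt.sum() / (s * np.sqrt(len(port_evt)))
    return port_evt, daily_t, window_t


# Synthetic data: 50 estimation days, 45 banks, a three-day event window.
rng = np.random.default_rng(1)
ar_est = rng.normal(0.0, 0.01, size=(50, 45))
ar_event = np.full((3, 45), 0.01)

port_evt, daily_t, window_t = portfolio_t_stats(ar_est, ar_event)
```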


The Announcement by Other Banks: The Bank-Specific Effects Case 


The announcements by the other firms extended in calendar time from the Citicorp
date to 24 July 1987. The 45 announced increases ranged from 0.04 percent of total
assets (Suntrust Banks) to 2.3 percent (Manufacturers Hanover), and from 4.4 percent
(Republic NY Corp.) of foreign loans to 69.6 percent (Valley National Corp of Arizona).

Citicorp raised its LDC reserve to approximately 25 percent of its LDC debt. This
initial value may have shaped the market’s expectations for other banks. By year end,
the LDC reserve level for money-center banks averaged 30.6 percent of LDC exposure
and ranged from 20.2 percent (BankAmerica) to 57.4 percent (First Interstate Bancorp).
Of the 35 smaller banks, 27 exceeded the original 25 percent Citicorp level after their
1987 increases.

In panel A of table 3, we report average daily abnormal returns for days −2 to +2 in
event time around the other 45 banks’ loan loss reserve announcements. The three-day
average abnormal return of 1.0 percent is positive and significant (α<0.04), which is
consistent with hypothesis H4. The 11 money-center banks had a significantly positive
return of 1.9 percent (α<0.05). These results are consistent with assertions that the
reserve announcements were viewed most favorably for the weaker money-center banks,
as posited in hypothesis H5. However, the difference in abnormal returns between the
money-center and regional banks is not significant at the 0.05 level. Also, as shown in
panel B of table 3, hypothesis H6 is not supported. None of the coefficients from
regression (1) is significant in explaining return variability.

One explanation for the weak results for the individual announcements is that they
were anticipated. Once Citicorp announced, the possibility of announcements by other
banks became more likely. Each subsequent announcement, therefore, could have
changed expectations regarding others, reducing the information content of the later
announcements.


The Bank of Boston Announcement: The Information-Transfer Case 


Panel A of table 4 reflects systematic and significant information transfers from the
Bank of Boston announcement to other banks. The three-day average abnormal returns
were significantly negative for both the money-center banks (−7.3 percent) and for the
regional banks (−1.4 percent). These negative returns are inconsistent with hypothesis
H1 but consistent with hypothesis H2 since the returns for money-center banks were
significantly more negative than for the regional banks.

The regression estimates shown in panel B of table 4 reveal a strong pattern of
information transfer regarding the relative health of the other banks at the Bank of Boston
announcement. The significant negative coefficient on LDCPART (β2) indicates that
negative returns are associated with large LDC exposure, while the positive coefficient
on MVBV (β3) indicates that the announcement increased the perceived differences
between the weaker and stronger banks. The coefficients on EC and RES were not
significant. The estimated regression explained 36 percent (adjusted R²) of the
cross-sectional variability of returns, and is consistent with hypothesis H3.
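
For readers who want to see the mechanics of a cross-sectional regression of this form (three-day abnormal returns on EC, LDCPART, MVBV, and RES, with t-statistics and adjusted R²), the following Python sketch may help. It assumes plain OLS with conventional standard errors; the function name and the synthetic data, including the "true" coefficients, are hypothetical and are not the paper's data or estimates.

```python
import numpy as np


def ols_with_t(X, y):
    """OLS with an intercept; returns coefficients, t-statistics, and adjusted R^2."""
    n, k = X.shape
    Xc = np.column_stack([np.ones(n), X])          # prepend the intercept column
    coef, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ coef
    dof = n - (k + 1)                              # residual degrees of freedom
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(Xc.T @ Xc)        # conventional OLS covariance
    t_stats = coef / np.sqrt(np.diag(cov))
    tss = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - (resid @ resid) / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / dof
    return coef, t_stats, adj_r2


# Synthetic cross-section of 45 banks; columns stand in for EC, LDCPART, MVBV, RES.
rng = np.random.default_rng(2)
X = rng.normal(size=(45, 4))
true_coef = np.array([0.01, 0.5, -0.9, 0.06, -0.04])   # illustrative values only
y = true_coef[0] + X @ true_coef[1:] + rng.normal(0.0, 0.01, 45)

coef, t_stats, adj_r2 = ols_with_t(X, y)
```

The adjusted R² penalizes the raw R² for the number of regressors, which is why it can be slightly negative when the regressors have no explanatory power.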


Effects of Confounding Factors 


Such a significant negative reaction at the Bank of Boston announcement could 
also indicate the presence of other, confounding, information being released concur- 
rently. We inspected The Wall Street Journal for the period, but found no other informa- 
tion released that would explain such a strong reaction. We also estimated abnormal 
returns for the period for a control sample of 52 banks that were excluded from our ini- 
tial sample because they had no LDC debt exposure. These banks were generally much 


15 Regression (1) does not capture the possible, though subtle, effects of taxes on these banks. Although the
loss reserves are not tax deductible after the 1986 Tax Reform, they affect GAAP earnings and therefore also
affect the spread between taxable earnings and GAAP earnings. After 1986, this spread is a potential tax
preference item under the alternative minimum tax (AMT). We read the tax notes for all of the banks and
constructed a crude proxy for AMT consequences but found it insignificant as an additional term in the linear
regression. It was not significantly associated with three-day returns in a univariate sense either (Spearman
correlation, ρ=0.16, α>0.29).

16 Leverage plots and other diagnostics were examined to assure that regression results at each of the dates
were not disproportionately affected by a few observations. Similarly, daily and three-day average
returns were examined to verify that results were not driven by a few cases.
Table 4

Market Reactions to Bank of Boston Announcement: The Information Transfer Case
(Excluding Bank of Boston; Day 0 is 15 December 1987)

Panel A. Abnormal Returns Surrounding the Bank of Boston Announcement:

                 All Banks                Money-Center              Other Banks
Day         RETURN    t-statistic     RETURN    t-statistic     RETURN    t-statistic
−2         −0.0013   [illegible]     −0.0033      −0.64         0.0030       0.83
−1         −0.0085      −3.13        −0.0131      −2.60        −0.0068      −2.11
 0         −0.0202      −7.43        −0.0682   [illegible]     −0.0027      −0.85
+1         −0.0009      −0.34         0.0087       1.73        −0.0044      −1.37
+2          0.0143       5.27         0.0053       1.05         0.0176       5.44
R[−1,+1]ᵃ [illegible]   −6.29      [illegible]    −8.30        −0.0140      −2.80

Panel B. Regression of Potential Explanatory Factors; Adjusted R² = 0.36
(t-statistics in Parentheses):

R[−1,+1]ᵃ  =    β0    +   β1 ECᵇ   +  β2 LDCPARTᶜ  +   β3 MVBVᵈ   +   β4 RESᵉ
           [illegible]    1.133        −0.889           0.058         −0.045
            (−1.77)      (1.14)       (−2.83)          (2.88)        (−1.10)

ᵃ Average abnormal returns for days −1 to +1, where day 0 is the date the reserve increase was disclosed
  in The Wall Street Journal.
ᵇ EC=[(stockholders’ equity at 12-31-86 + loan loss reserve at 12-31-86)/total assets at 12-31-86]−0.055.
ᶜ LDCPART=LDC loans at 12-31-86/total assets at 12-31-86.
ᵈ MVBV=Ratio of market value (9-30-87) to book value (9-30-87).
ᵉ RES=(1986 LDC reserve + 1987 addition)/LDC exposure.


smaller than those included in our sample. The average abnormal return for these
banks for the three days around the Bank of Boston announcement was significantly
positive at 1.3 percent, suggesting that the negative information transfer to LDC banks
was not the outcome of general macroeconomic events.¹⁷


IV. Conclusion

We interpret the results to be consistent with information transfers from announcements
by leading firms to other industry members. Cross-sectional characteristics
that weakly explained increasing stock-price movements when reserves were
created by Citicorp were strongly negatively associated with the returns to these banks
when the Bank of Boston later announced a large write-off. The Bank of Boston write-off
was accompanied by an additional increase to the Bank of Boston reserve, so the
salient features may include the net increase in projected losses. In spite of the large
size of the announced Citicorp reserve increase and the significant 10 percent increase
in its associated stock price, few information transfers are observed for this
announcement.


17 This result is [...] debt exposure
(or, in this case, [...]
Our results overlap in various parts with work by Griffin and Wallach (1991), Grammatikos
and Saunders (1988), and Johnson (1989). In those areas of similarity, our results
are broadly consistent. Our results on bank loss disclosures confirm the existence
of systematic information transfers related to accounting announcements and support
the findings of Foster (1981), Clinch and Sinclair (1987), and others. At the same time,
we confirm the weak and erratic nature of such transfers.
MEMORIAL
William Joseph Vatter
(1905-1990)


Every science, methodology, or other body of knowledge is oriented to some conceptual
structure—a pattern of ideas brought together to form a consistent whole or a frame of ref- 
erence to which is related the operational content of that field. Without some such inte- 
grating structure, procedures are but senseless rituals without reason or substance; 
progress is but a fortunate combination of circumstances; research is but fumbling in the 
dark; and the dissemination of knowledge is a cumbersome process, if indeed there is any 


“knowledge” to convey.

William J. Vatter,
The Fund Theory of Accounting and Its Implications for Financial Reports, page one,
paragraph one.


William Joseph Vatter, the son of John S. Vatter, Sr. and Elizabeth Renzenbrink
Vatter, was born in Cincinnati, Ohio on January 6, 1905. He died in Oceanside,
California on September 15, 1990. After graduating from Hughes High School in
Cincinnati, he spent two years at the University of Cincinnati and the Cincinnati
Conservatory of Music before leaving school to go to work.

Early in life, Vatter aspired to be a professional musician. At age seven, he received
his first violin from an older brother. In high school, he played the viola in its orchestra.
At the Conservatory of Music, he studied the French horn (and other wind instruments)
with Modest Alloo. Alloo’s departure to join the music faculty at Berkeley left Vatter
“unsettled.”

(Details of Vatter’s early life come principally from letters written by William and
Rose to Charles Horngren who received his Ph.D. under Vatter at Chicago. Professor 
Horngren graciously supplied me with copies of this correspondence, and some other 
materials, as soon as he heard that I was to write this Memorial.) 

He tried his hand at playing in theater orchestras, but the introduction of talking pic- 
tures in the late 1920s and early 1930s sounded the death knell of the pit orchestra in the 
movie houses.

In the course of his musical activities, he met Rose Schumacher of Batesville,
Indiana, who introduced him to the study of accounting. He found a job as a junior
auditor with Singer Sewing Machine Company in Cincinnati. He and Rose were married
August 2, 1930, a lifetime union, during which they had two children, Gretchen and 

Within a year, in the depths of the Great Depression, Singer combined its Cincinnati 
and Indianapolis agencies, and Vatter, as a junior member of the staff, was out of work.
He found it impossible to get a regular job, so he accepted an offer from Rose’s uncle to
lend them enough money for William to prepare himself for the CPA examination. The 
Vatters moved to Oxford, Ohio, where both of them attended Miami University. William 
graduated from Miami in 1934 where he was elected to Phi Beta Kappa and, at age 29, 
received his bachelor’s degree, summa cum laude, with a major in accounting. For a few 
months, he worked as an auditor in the Indiana Gross Income Tax Department.
Meanwhile, an instructor at Miami University resigned because of poor health; William was
asked to teach accounting, beginning in the fall term. 

In 1936 Vatter left Miami for the University of Chicago where he stayed until 1957 as
graduate student and faculty member. In 1936, he held the title of Instructor; in 1957 he
was Professor of Accounting and Production Control. From 1942 to 1944 he was Director
of Finance at the Metallurgical Laboratory at Chicago, which housed the “Manhattan
Project” that developed the first atomic bomb. He received his Ph.D. in 1946; his dissertation
was published that year under the title, The Fund Theory of Accounting and Its
Implications for Financial Reports. It has gone through several printings, and has been translated
into Japanese. In this monograph, he employs as a central idea the concept of a “fund”
which is an area of operations, a center of interest, or a center of attention. This protean
concept unifies profit and not-for-profit accounting, encompasses divisional as well as
combined and consolidated financial statements, and national income accounting. As
one corollary, it does not make income or profit determination its center or focus.

His interest in managerial accounting resulted in a collection of materials prepared
for student use. In 1950 Prentice-Hall persuaded Vatter to let them publish this
“preliminary edition” of Managerial Accounting. It was reprinted 18 times before he called a
halt in 1962. Horngren tells us that “every cost or managerial accounting book published
in America during the last 30 years shows the influence of Managerial
Accounting.”

In 1957 Vatter left Chicago to join the accounting faculty at Berkeley, where he
remained until his retirement in 1972. While on leave from Berkeley, he made a study of
accounting education in Australia at the invitation of the leading accounting bodies in 
that country. The study was published in 1964 under the title, Survey of Accounting 
Education in Australia, and received the rare accolade of having many of its recommen- 
dations adopted by the professional bodies. Later, in 1971, Richard D. Irwin, Inc. pub- 
lished his Accounting Measurements for Financial Reports. This text grew out of his ex- 
tensive experience, both at Chicago and at Berkeley, in teaching accounting to MBA 
candidates without any previous coursework in the field. Vatter compressed a typical
year’s work into a half-year, in large part by using a strict, cash-based system to
introduce the essentials of a double-entry system. He had no accruals whatsoever in his first
two chapters, but still was able to present transactions in double-entry form,
summarize them, prepare flow statements, and introduce other concepts before broaching
such ideas as the difference between asset and expense.

A CPA in Ohio since 1936, Vatter was a member of the American Institute of CPAs,
the National Association of Accountants, the Financial Executives Institute, and the 
American Accounting Association. He was a member of the AAA’s Committee on Con- 
cepts and Standards, acting as its chairman, 1962-1963; in 1970-1971, he was one of 
the Vice-Presidents of the AAA. 

Additional honors include two Fulbright awards for work in Australia; the AICPA
Accounting Literature Award in 1966; Honorary Fellow of the Australian Society of
Accountants; Alpha Kappa Psi Award in 1948; the Berkeley Citation in 1972; Fellow of
the Accounting Researchers International Association; and Outstanding Educator
Award of the AAA in 1984.


Some Recollections


The Journal of Business of April 1937 published an article of Vatter’s entitled
“Depreciation Methods of American Industrial Corporations, 1927-1935.” I was
impressed by his approach: he tested a statement from The Accountants’ Handbook that
said, unequivocally, that the straight-line method of depreciation was well-nigh
universal in the U.S. He examined the financial statements of a large number of companies
and looked for the relationship of the amount of depreciation taken to other variables,
such as the investment in depreciable plant and equipment, and sales. The closest he
could find to a “straight-line” relationship was with “gross profit”; the depreciation
charge seemed to rise and fall with gross profit more closely than with any other
variable. Sweeping statements, even authoritative statements, were not to be accepted
without some proof, without some testing.

He and I did not meet until we both served on the AAA’s Committee on Concepts
and Standards in the 1950s. Prior to Bill’s arrival on the Committee, the other members
and I took issue with each other, but we waited until the speaker finished a sentence
before setting him straight. The first morning with Bill on board was a nightmare: as
soon as one of us opened our mouths, he jumped in to take the floor. By lunchtime, I
was exhausted. By sheer luck, we sat together for the noon meal. He turned to me and
said, “I think we made real progress this morning, don’t you?” I smiled, because I now
realized how to handle this “wild man” from the University of Chicago. That
afternoon, I jumped on him as soon as he opened his mouth! He quieted down, and became
one of the most constructive members of the Committee. By that time, we had become
good friends, and remained so to the end of his life.
One puzzler to me was his failure to mention his early interest in music. In my case,
I learned about it only after he died. What makes this a puzzle is that I have played the
violin since I was ten years old, and started on the instrument in Cincinnati, as did Bill.
As it turned out, he and I knew many of the same people and the same places. True, he
was five years older than I, and he was professional, and I was and still am a strictly
Grade B amateur. Still, he knew about my interest in the violin, but never revealed his. I
can only surmise that the blunting during the Great Depression of his ambitions to
become a full-time musician hurt him so much that he found it distasteful to talk
about it.

Bill apparently lived according to his principles. In 1957 when he left Chicago to
come to Berkeley, he told me that he was the last of the faculty at Chicago to be on the
“fixed-income” plan pioneered by Robert Maynard Hutchins when he was President of
that University. Under this plan, faculty members were given salaries substantially
higher than the “market” rate in exchange for contributing their outside income, such
as fees for consulting services, lecture honoraria, book royalties, etc. to the University.
Bill liked this arrangement because he viewed himself as a scholar full time. He never
looked for or solicited consulting engagements and, as we have already seen, his highly
successful textbook in managerial accounting was, to him, an unexpected by-product of
his scholarly activities.

The last time I saw Bill was in the summer of 1989 in Oceanside, California. My 
wife and I had lunch with Bill and Rose, and spent most of the afternoon with them. Bill
was in poor health then, but his mind was sharp. He could barely speak above a 
whisper, but he kept throwing questions at me about events in the accounting world. 
He even wanted to know if it was true that an accounting MBA student at Berkeley was 
not required to take a course in accounting theory in order to get a degree. I told him 
that, unfortunately, it was true. He was incensed, even agitated. How was it possible, he 
asked, to send someone out into the world as an accountant, from a leading university, 
without rigorous exposure to some conceptual or integrating structure, or some frame
of reference.
Here he was, 84 years old, 17 years into retirement, in ill health, without responsi-
bility or obligation in the matter, still battling as though he were fighting the good fight.
As we parted, he reached out to embrace me, as though he knew we would never see
each other again.
Maurice Moonitz
University of California at Berkeley


Shane R. Moriarity, Editor


Editor’s Note: Two copies of books for review should be sent to Professor 
Shane R. Moriarity; School of Accounting, University of Oklahoma, 
Norman, OK 73019. The policy of the Review is to publish only those 
reviews solicited by the Book Review Editor. Unsolicited reviews will not 
be accepted. 


VICTOR A. CANTO and ARTHUR B. LAFFER, editors, Monetary Policy, Taxation,
and International Investment Strategy (New York: Quorum Books, 1990, pp. xli,
325, $49.95).


Although this book is really a collection of what appears to be excerpts from a series of 
articles on supply-side economics, almost all of these papers were written by one or both of the 
editors. In this sense, Canto and Laffer are really the authors of this book rather than merely 
editors. Certainly, most of the policy proposals are theirs. 

The book is divided into three segments. The first segment, dealing with monetary policy, 
serves basically as an introduction to the second segment on taxation. The basic premise of this 
section is that inflation is not caused by increases in economic activity (which might eventually 
strain capacity), but instead is a monetary phenomenon of too much money chasing too few 
goods. This problem can be solved by increasing production. Thus, to prevent inflation, a
country should follow an economic policy which encourages economic growth.

Canto and Laffer propose tax policy as the primary tool for encouraging economic growth. 
Thus, the second segment of the book emphasizes the importance of low levels of taxation,
particularly for capital gains. Unfortunately, the arguments put forward in defense of reducing
the capital gains tax are not well-developed.

The problem is that the authors really believe that “capital gains is not income” (p. 147), and
therefore the tax rate on capital gains should be zero. But most of their discussion of the need
for reducing the capital gains tax addresses other issues—inflation, bracket creep, the lock-in
effect, and so on. Since these are not the real reasons they favor reduction, they spend relatively
little time in developing any of them fully. As a result, they tend not to be convincing.

This section of the book also tends to be repetitive, highly partisan, and often contradictory.
It is not uncommon for entire paragraphs to be repeated verbatim within the space of a few
pages (e.g., pp. 125 and 148, 129 and 140, 129 and 142, etc.). At one point, they accuse George
Mitchell of being “single-handedly” responsible for the stock market crash of October 1987
(p. 157). Only three pages later, however, proposed legislation in the House Ways & Means Com-
mittee headed by Dan Rostenkowski is targeted as “perhaps the single event that triggered the
1987 crash.” And, though they criticize Congress for a lack of continuity in formulating tax
policy (p. 150), their proposal for capital gains tax reform would implement a “one-time, one-
year reduction [of the capital gains tax rate] to 15 percent” (p. 143). This reduction would then
be followed by a prospective indexing scheme which would offset the effects of future inflation
in measuring capital gains.

The ideas developed in the segment on tax policy also form the basis for the authors’ port-
folio strategies, such as they are. Essentially, the premise of this book is that a favorable tax
environment leads to favorable business conditions. Thus, they outline a “portfolio strategy”
which can be summarized in two sentences. First, investors should include in their portfolios
stock in domestic companies which are headquartered in low-tax states, and exclude stock in
domestic companies headquartered in high-tax states. With regard to international companies, 
portfolios should exclude companies based overseas when the U.S. real exchange rate appre- 
ciates relative to its trading partners, and include these companies when that rate depreciates. 
In closing, reading this book is rather tedious. For the most part, it tends to be a somewhat 
rambling collection of loosely connected thoughts. Part of the problem is that it is not really a 
book, but excerpts from a series of papers stitched together around an unfocused theme. But the 
problem is deeper than that. Some of the chapters are repetitive, and others seem out of place 
being in the book at all. If the editors find an idea interesting, they include a chapter on it, re- 
gardless of its contribution to their central argument. If they find the idea really interesting, they 
repeat it several times. If a reader shares Canto’s and Laffer’s views before reading this book, he 
or she probably will continue to share them afterward. But those readers who do not agree, or 
have not developed an opinion, are not likely to be convinced of the merits of their proposals as 
a result of reading this book. 
ROBERT C. RICKETTS 
Assistant Professor of Accounting 
Texas Tech University 


HUGH M. COOMBS and J. R. EDWARDS, Accountability of Local Authorities in Eng- 
land and Wales, 1831-1935 (New York: Garland Publishers, Inc., 1990, Volume 1, 
pp. xvi, 288; Volume 2, pp. ii, 425, $134.00). 


This work is an anthology of materials relating to the development of statutory regulations 
on the degree of accountability required from local governmental units, in Great Britain, be- 
tween 1831 and 1935. Its authors’ stated purpose is “to provide a comparative basis for work on 
the private sector and to provide an historical perspective for the system of local government 
accounting presently in use” (p. i). It does this by making available, to persons interested in the 
evolution of accounting and auditing principles, a wealth of information. Since state and local 
governmental units in the United States have borrowed many accounting practices from their
British counterparts, this two-volume work should be of interest to American, as well as British,
readers.

The first volume starts off with a brief summary of the development of local authority services
before 1831 and the level of accountability required from persons providing these services,
as well as an outline of some background developments that occurred during the period covered 
by this anthology. The introduction is followed by the following tables: 


1. A select list of relevant acts of parliament passed during this period; 

2. A select list of relevant government committees, royal commissions, and returns made to 
Parliament during this period; 

3. Annual reports of the Poor Law Commissioners (1834-1848), the Poor Law Board (1849-
1871), the Local Government Board (1872-1919), and the Ministry of Health (1920-1935);

4. Local taxation returns made to Parliament (annual reports of revenues and expenditures, by 
category, of local authorities); 

5. Orders, circulars, and instructional letters issued by the Poor Law Commissioners and their
successors on keeping and auditing accounts; and 

6. A bibliography of books and articles dealing with the structure and accounting practices of 
local authorities.


The next section contains accounting and auditing extracts from relevant acts of Parlia- 
ment. The provisions repealed by later regulations are also shown, so that the reader will be 
aware of the requirements in force at different points in time. Of particular interest is the Local 
Government Act of 1933, which brought together regulations relating to various aspects of local 
government. A detailed discussion of the origins and significance of each section and 
subsection of this act is provided. 

Also found in Volume 1 are the published financial statements of the Rhondda (Wales) Ur-
ban District Council for the year to 31 March 1907 and summarized financial statistics for 1904-
1905 for various types of local government units in England and Wales. This volume con-
cludes with a summary of accounting and auditing regulations in force in 1873, 1907, and 1932.

Volume 2 consists of extracts from evidence collected and reports prepared by royal com-
missioners and select committees, which focus on the accountability of local authorities. These
extracts are organized chronologically, within the categories of poor law, public health and
sanitation, municipal corporations, and local government. This classification scheme leads to a
certain amount of unavoidable overlapping between sections, but it enables the reader to iden-
tify relevant material more easily than a simple chronological presentation would.

While this work seems daunting at first glance, it is really quite well laid out. The concise
history at the beginning of Volume 1 provides an excellent overall perspective. The order of pre-
sentation of materials is logical and reproduction ranges from fair to excellent. My only com-
plaint is the manner of (or lack of) indexing, especially in Volume 1. Since most readers will be
looking for specific information, a detailed index at the front and a subject index at the back of
each volume would be very helpful.
The authors are to be commended for successfully bringing together a great deal of difficult-
to-obtain information. Persons performing in-depth research into the antecedents of present day
accounting and auditing practices will find this work particularly useful. Those with a cursory
interest in accounting, as well as history, will also find this work interesting, especially
Volume 2.
JOSEPH R. RAZEK
Professor of Accounting
University of New Orleans


H. G. HEYMAN and ROBERT BLOOM, Opportunity Cost in Finance and Accounting
(New York: Quorum Books, 1990, pp. xv, 199, $45.00).


The book achieves its main objective, which is to [...]
within the context of business problem solving. This objective is achieved in a clear and easily
readable manner although there is frequent use of sentences containing 40 or more words. In
addition, approximately 40 percent of the text presents repetitive material.

These two observations are not intended as criticisms. Long sentences are the consequence
of creating a masterful blend of economics, finance, and accounting theories, with special atten-
tion to the use of the opportunity cost principle in the derivation and implementation of these
theories. Frequent repetition of this material helps the reader to: (1) remain focused on the key
issues discussed, (2) follow arguments that are based on various economic theories and models,
and (3) maintain continuity of conceptual foundations.

The first third of the book is devoted to a review of classical economics and finance theories
because most decision making methods and general management procedures are grounded in
the economic equilibrium models (e.g., the net present value [...] and other firm valuation
models). Under the assumption of ideal markets for goods and services, the absence of opportu-
nity costs defines market equilibrium, while their existence defines a disequilibrium condition.
Also, the inadequacy of accounting information in problem solving becomes apparent since
accounting measures reflect only explicit costs, whereas economic decisions must be based on
explicit, implicit, and opportunity costs.

The next third of the book is devoted to the application of various economic theories to
financial decisions. Most textbooks, as well as some real-life managerial decision-making meth-
ods, are based on normative models and not on practical problem solving approaches. By re-
viewing these familiar exercises one can see their limitations and understand why they fall
within the theoretical domain of analytic model building (normative and positive). Thus, another 
objective of the book, the separation of the theoretical from the practical domain of problem 
solving, is achieved. 
The final third of the book builds upon the now apparent [...]
optimization models. The systemic model building [...] is introduced where isolated, long-
run market equilibriums for production, investment, and financing decisions do not exist. In-
stead, any solution to a particular problem is interrelated with other problem areas of a given
organization and optimum solutions are replaced with continuous equilibrium solutions. In
addition, the deterministic view of passive decision making is replaced with experimental learn- 
ing processes, active management procedures, and strategies for reducing organization-wide, 
generalized opportunity costs. Thus, the final objective of the book, the clarification of issues 
associated with the opportunity cost principle and its utilization in the formulation of strategic 
management procedures, is achieved. 

In this type of decision making environment, business problem solving becomes concerned 
with continuously adapting the entire organization to changing economic and legal conditions 
and creating strategies that reduce overall opportunity costs unique to the entity. The general-
ized (systemic) opportunity cost principle explains the rise of new managerial processes (e.g.,
JIT and quality circles) and the acceptance of decision criteria not expressed in dollars (e.g.,
quality and service). 

The book concludes with a discussion of decision support systems. These systems are per- 
fectly suited for systemic model building and generalized problem solving since they are used to 
reduce implicit and opportunity costs, as opposed to explicit costs. They give managers the abil- 
ity to swiftly react to unexpected events and plan to reduce opportunity costs through simula- 
tions. 

The use of this book with advanced undergraduate and graduate courses in financial man- 
agement and management accounting will enhance the students’ understanding of the origins 
and limitations of many finance and accounting concepts. However, the price may constitute a 
major drawback. Also, the discussions pertaining to the virtues of certain economic theories
should lead to much lively debate.

ARA G. VOLKAN
Professor of Accounting
West Georgia College


J. EDWARD KETZ, RAJIB K. DOOGAR, and DAVID E. JENSEN, A Cross-Industry
Analysis of Financial Ratios: Comparability and Corporate Performance (New
York: Quorum Books, 1990, pp. x, 218, $49.95).


The analysis of financial statements often focuses on comparing financial ratios of a given
company with those of other firms in the same industry or for the same company across time. In
such comparisons, specific ratios are picked from commonly-accepted groupings in order to
achieve parsimony in the analysis. Not surprisingly, academic researchers have often been inter-
ested in the development and validation of taxonomies of financial ratios. Most of the extant
studies have focused on the manufacturing and retail sectors of the economy.

This book, to a considerable degree, replicates this literature on financial ratio classification 
schemes. It examines the ratio classification patterns for manufacturing and retail firms for the
ten-year period from 1978 to 1987. In an approach essentially unchanged since the pioneer-
ing study by Pinches, Mingo, and Carruthers (Journal of Finance, May 1973, pp. 389-98), the
authors use factor analysis to develop the taxonomy. 

The book, however, does extend the existing literature in some areas. First, the analysis per- 
formed in the book offers an in-depth taxonomic scheme for seven industry groups: automobile 
and aerospace; chemical, rubber, and oil; electronic; food; retail; steel; and textile. Second, it 
compares the results obtained when different factor analysis methods such as common factor 
analysis, iterated principal factor analysis, and maximum likelihood factor analysis are used. It 
also examines the effect of different rotation techniques, namely, the orthogonal approach (vari-
max, quartimax, and equamax rotations) and the oblique approach (orthooblique and promax
rotations) on the results. The effects of these differences in the technique applied are discussed at
length in the relevant sections. Finally, congruency coefficients have been computed for both
time series and cross-sectional assessments to determine the stability of the taxonomies. 

In addition to the extensions of the existing literature in these areas, the period examined
(1978 to 1987) is potentially interesting because of the high rate of inflation (and interest rates) in
the 1978-1980 period, the deflationary environment of 1981-1983, the deep recession in 1980-
1982, and the strong economic expansion during the 1983-1987 period. The great diversity of
economic conditions in the period covered offered the hope that some intuitive or theoretical
explanations might be offered for changes in the taxonomies over time, or for interindustry 
differences observed. 

Regrettably, the authors provide very little rationale (theoretical or speculative) for their 
findings. For example, readers are told that 1981 appears to be a “break year” for the food 
industry because the factor patterns for that year have low congruency scores compared to 
those of the surrounding years (pp. 121, 128). Yet, no insight is provided as to why this is the case.
Is the reader to assume that the recession which began in that year had a differential impact on 
the food industry? As another example, the reader is told that there are two separate return fac- 
tors for the retail industry: a return on sales factor, and an unrelated return on assets and debt 
factor (p. 141). Unfortunately, the authors provide no clue as to why these two factors exist for 
the retail industry but not for other industries. 

The book, however, has its strengths. The authors do a commendable job of providing back- 
ground information on the industries covered in the study. They provide interesting statistics on 
the level and volatility of these ratios over the time span covered. They also discuss possible rea- 
sons for the observed interindustry differences in the level and volatility of these ratios. 

Overall, I recommend the book to researchers who are seeking some justification to pick 
only a few parsimonious ratios rather than the slew of financial ratios which have typically been 
used in the past. The book should also be useful as a supplement to more standard textbooks on 
financial statement analysis. The authors identify differences in ratio taxonomies by industry, 
and also find many of the industry-specific taxonomies fairly stable across the different eco-
nomic environments. This temporal stability indicates that the ratio groupings reported might
be useful in specific industry analyses.

YAW M. MENSAH
Associate Professor of Accounting
Rutgers University at New Brunswick


MATTHEW LESKO, The Federal Data Base Register (Kensington, MD: Information 
USA, Inc., 1990, pp. lv, 571, $125.00, paper). 


The first page of the Register lists 14 departments, 40 independent agencies, two sections of 
the Judicial branch, the executive branch, and six legislative groups of the Federal government 
included in the Register. All of these government units have data bases that can be accessed by 
the public via print, magnetic disk, or online search capability. How one sorts out what informa- 
tion each department has available for distribution, to whom to direct questions, what it costs to 
access this data, and how to access the data is a task made humanly possible by The Federal 
Data Base Register. 

The Register, now in its third edition, generally provides a timely, one-paragraph description 
on approximately 2,500 studies and reports that are available through the departments and 
agencies of the government. The descriptions seem to be geared to a nonacademic audience and 
provide few of the details that are needed by serious researchers. Each entry typically contains a 
summary of the files contained in the data base, the time periods covered, the approximate num- 
ber of observations, the method of delivery, the cost, and who to contact for the data. Given the 
limited space devoted to each summary, vague phrases commonly appear that may leave the 
researcher with more questions than answers about what is contained in the data base. 

The strength of the book is that it sorts through a maze of Federal data in the public domain. 
The book lists a table of contents that identifies data bases by department. In addition, a limited 
index of selected key words helps direct one to the relevant data base. The book is clearly an 
essential part of any government documents section of a library. How useful the Register is to 
researching and practicing accountants is a different question and leads me to a different con- 
clusion as to the usefulness of the text. First, a more detailed review of the Register reveals that 
very few departments of the government devote time to the study of business. Thus, only a small 
portion of the Register is useful to the research that currently appears in most accounting jour-
nals. The Departments of Commerce and the Treasury as well as the Securities and Exchange
Commission (SEC) and the Federal Reserve System are likely to provide data of interest to ac-
countants. For example, the Register gives a description of the Electronic Data Gathering Anal-
ysis and Retrieval System (EDGAR), a pilot project begun in 1984 to electronically collect and
disseminate financial data. To date, over 49,000 filings have been received for over 1,200 regis- 
tered firms. However, the only public access to the data is through the public reference rooms 
the SEC maintains in Chicago, New York, and Washington. Another potential source of useful 
data is corporation records, terminations, and filings made to the Pension Benefit Guaranty Cor-
poration which can be obtained under the Freedom of Information Act.

One caveat is that even if the source of data can be found and the data purchased, there is
no guarantee that the data, once obtained, can be easily accessed. As distinct from other com-
mercial data base sources, the Government makes no claim that it will maintain a strong custo-
mer support department or that it will provide users with extensive documentation or a custo-
mer department to field questions about how the data was collected and how to resolve any
problems such as computer access. 

JOSEPH WEINTROP 
Assistant Professor
State University of New York at Buffalo


J. C. VAN DIJK and PAUL WILLIAMS, Expert Systems in Auditing (New York: Stock- 
ton Press, 1990, pp. 192, $140.00). 


The authors state that this book “has been written primarily for the practising auditor. . . . [I]f
reading this book helps him gain an understanding of the potential of expert systems for the
auditor’s practice, and of their limitations, then it has been successful” (pp. 9-10). I do not be-
lieve that the authors attain that objective.

The book is divided into three sections. Part 1 contains six chapters about artificial intelli-
gence (AI) and expert systems. The authors begin with an ambitious survey of the history of AI
in which they link AI to ancient (Pygmalion) and medieval (Golem) attempts to create artificial
human beings. Although many interesting facts are presented, their development is often incom-
plete. For example, the reader learns that Turing developed a test to prove whether a machine is
intelligent, but the nature of that test is not explained. Instead, the reader learns that Turing
worked on cryptology in World War II. Some of the definitions of terms are cryptic. For exam-
ple, knowledge representation is defined as “the representation of the domain knowledge in the
knowledge base” (p. 189). The explanation of concepts is also uneven. On the one hand, the
authors successfully use an analogy to explain that using a shell to build an expert system is
similar to using a spreadsheet to build a financial model. On the other hand, only one sentence
(accompanied by a confusing diagram) is devoted to explaining the concept of frames.

Part 2 consists of 11 chapters that describe the various steps in the audit process and
explore the possibilities for building expert systems to perform those tasks. The authors use a
cost-benefit analysis based on the identification of bottlenecks in the audit process to guide
their selection of suitable projects. The discussion of audit activities is not likely to be new to the
intended audience (practitioners), but the chapter on the risk-based approach to auditing may be
of interest to academics who do not teach auditing. Examples of expert systems developed by
academics and practitioners are presented throughout these chapters. Although the descriptions
are short, Appendix C provides a good bibliography to enable the reader to follow up on any
system that appears interesting.

The book’s major contribution to practitioners (its intended audience) probably comes in 
Part 3, which consists of four chapters that discuss important issues that should be considered
when thinking about implementing expert systems. Among the topics covered are: (1) the effect
of using expert systems on the development of audit expertise, (2) the importance (and cost) of
knowledge base maintenance, (3) the need for and difficulty of validating expert systems, and
(4) how to manage the development of expert systems. 

This book would probably not be of interest to readers who are familiar with expert systems
because most of the information is not new. Readers who have not kept abreast of expert sys-
tems developments, and practitioners, may find some useful information in the book but will
probably be disappointed in the lack of depth in which most topics are covered.

PAUL JOHN STEINBART 
Associate Professor of Accounting 
Memphis State University


Footnotes: Footnotes are not used for documentation. Textual footnotes should be used only for extensions
and useful excursions of information that if included in the body of the text might disrupt its continuity. Foot- 
notes should be double spaced and numbered consecutively throughout the manuscript with superscript 
Arabic numerals. Footnotes are placed at the end of the text. 


SUBMISSION OF MANUSCRIPTS 
Authors should note the following guidelines for submitting manuscripts: 


1. Manuscripts currently under consideration by another journal or other publisher should not be submit- 
ted. The author must state that the work is not submitted or published elsewhere. 

2. In the case of manuscripts reporting on field surveys or experiments, four copies of the instrument (ques-
tionnaire, case, interview plan, or the like) should be submitted.

3. Four copies should be submitted together with a check in U.S. funds for $50.00 for members or $100.00 
for nonmembers of the AAA made payable to the American Accounting Association. Effective January 
1990, the submission fee is nonrefundable. 

4. The author should retain a copy of the paper.

5. Revisions must be submitted within 12 months from request, otherwise they will be considered new sub-
missions.


COMMENTS 


Comments on articles previously published in The Accounting Review will be reviewed (anonymously) by 
two reviewers in sequence. The first reviewer will be the author of the original article being subjected to cri-
tique. If substance permits, a suitably revised comment will be sent to a second reviewer to determine its
publishability in The Accounting Review. If a comment is accepted for publication, the original author will be 
invited to reply. All other editorial requirements, as enumerated above, also apply to proposed comments. 


POLICY ON REPRODUCTION 


An objective of The Accounting Review is to promote the wide dissemination of the results of systematic 
scholarly inquiries into the broad field of accounting. 

Permission is hereby granted to reproduce any of the contents of the Review for use in courses of instruc- 
tion, as long as the source and American Accounting Association copyright are indicated in any such 
reproductions.

Written application must be made to the Editor for permission to reproduce any of the contents of the
Review for use in other than courses of instruction—e.g., inclusion in books of readings or in any other
publications intended for general distribution. In consideration for the grant of permission by the Review in 
such instances, the applicant must notify the author(s) in writing of the intended use to be made of each 
reproduction. Normally, the Review will not assess a charge for the waiver of copyright. 

Except where otherwise noted in articles, the copyright interest has been transferred to the American Ac- 
counting Association. Where the author(s) has (have) not transferred the copyright to the Association, ap- 
plicants must seek permission to reproduce (for all purposes) directly from the author(s). 


POLICY ON DATA AVAILABILITY 


The following policy has been adopted by the Executive Committee in its April 1989 meeting. 

“An objective of (The Accounting Review, Accounting Horizons, Issues in Accounting Education) is to pro-
vide the widest possible dissemination of knowledge based on systematic scholarly inquiries into accounting
as a field of professional, research, and educational activity. As part of this process, authors are encouraged 
to make their data available for use by others in extending or replicating results reported in their articles. 
Authors of articles which report data dependent results should footnote the status of data availability and, 
when pertinent, this should be accompanied by information on how the data may be obtained.”


AMERICAN ACCOUNTING ASSOCIATION
Announcement of
Fellowship Program in Accounting


The purpose of the Fellowship Program is to increase the supply of qualified
teachers of accounting in the United States and Canada. Fellowships will be
awarded to assist individuals in furthering their preparation, through doc-
toral studies, for teaching in colleges and universities.


ELIGIBILITY


1. Expressed interest in and outstanding [...] a career in teaching
accounting.

2. Acceptance in a doctoral program at a school accredited by the AACSB at
the master's level.

3. Submission of a complete set of application forms.

4. Only candidates entering the doctoral program are eligible. Actual awards
will go only to students in the first year of their doctoral program. 

5. Foreign students are eligible for fellowships if they are residents of the U.S. or
Canada at the time of application, are enrolled in or have a degree from
a U.S. or Canadian accredited graduate program, and plan to teach in the
U.S. or Canada.

6. All awards will be in the amount of $1,000 to be made to outstanding
applicants based only on academic merits.


APPLICATION 
1. The application must be complete in all respects; i.e., the required number
of forms requested, [...] letters, and graduate [...] test
scores.
2. The deadline for receipt of all application materials is February 1.
3. Requests for application forms should be addressed to: 
Marie Hamilton, Office Manager 
American Accounting Association 
5717 Bessie Drive
Sarasota, Florida 34233-2399 


FIELD OF STUDY 
The doctoral program is expected to be in accounting or in another area
suited to preparing the applicant for teaching accounting. 


DURATION 
Fellowships will be granted for one academic year. 


PAYMENT 
Payment of the academic year Fellowships will be made to the recipients at
the beginning of the first semester or quarter. The recipient must submit
evidence of on-campus enrollment.


ANNOUNCEMENT OF AWARD
Awards will be announced on or about May 31.


ADMINISTRATION
The program is administered by a committee composed of members of the
American Accounting Association.